Mutant endoglycoceramidases with enhanced synthetic activity

ABSTRACT

The present invention relates to a novel endoglycoceramidase whose hydrolytic activity has been substantially reduced or eliminated, such that the enzyme is useful for synthesis of glycolipids from a monosaccharide or oligosaccharide and a ceramide. More specifically, the endoglycoceramidase is a mutant version of a naturally occurring endoglycoceramidase, preferably comprising a mutation within the active site or the nucleophilic site of the enzyme and more preferably comprising a substitution mutation of the Glu residue within the active site or the nucleophilic site. Also disclosed are a method for generating the mutant endoglycoceramidase and a method for enzymatically synthesizing glycolipids using this mutant enzyme.

CROSS-REFERENCES TO RELATED APPLICATIONS

This application is a Continuation of U.S. application Ser. No. 13/950,259, filed on Jul. 24, 2013, which is a Divisional application of U.S. application Ser. No. 11/596,942, filed on Nov. 17, 2006, which is a 371 U.S. National Phase Application from PCT/US05/019451, filed Jun. 1, 2005, which claims the benefit of U.S. Provisional Application 60/576,316, filed Jun. 1, 2004; U.S. Provisional Application 60/626,791, filed Nov. 10, 2004; and U.S. Provisional 60/666,765, filed Mar. 29, 2005, the disclosures of each of which are incorporated herein by reference in their entirety for all purposes.

REFERENCE TO THE SEQUENCE LISTING

SEQ ID NO:1: nucleic acid sequence of a wild-type endoglycoceramidase from Rhodococcus sp. M-777. GenBank Accession No. U39554.

SEQ ID NO:2: amino acid sequence encoded by nucleic acid sequence of SEQ. ID. NO.:1.

SEQ ID NO:3: amino acid sequence of a wild-type endoglycoceramidase from Rhodococcus sp. M-777. GenBank Accession No. AAB67050.

SEQ ID NO:4: nucleic acid sequence of a wild-type endoglycoceramidase from Rhodococcus sp. C9. GenBank Accession No. AB042327.

SEQ ID NO:5: amino acid sequence encoded by nucleic acid sequence of SEQ. ID. NO.:4.

SEQ ID NO:6: amino acid sequence of a wild-type endoglycoceramidase from Rhodococcus sp. C9. GenBank Accession No. BAB17317.

SEQ ID NO:7: nucleic acid sequence of a wild-type endoglycoceramidase from Propionibacterium acnes KPA171202. GenBank Accession No. gi50839098:2281629.

SEQ ID NO:8: amino acid sequence of a wild-type endoglycoceramidase from Propionibacterium acnes KPA171202. GenBank Accession No. YP_056771.

SEQ ID NO:9: amino acid sequence of a wild-type endoglycoceramidase from Propionibacterium acnes KPA171202. GenBank Accession No. YP_056771.

SEQ ID NO:10: nucleic acid sequence of a wild-type endoglycoceramidase from Propionibacterium acnes KPA171202. GenBank Accession No. gi50839098:c709797-708223.

SEQ ID NO:11: amino acid sequence of a wild-type endoglycoceramidase from Propionibacterium acnes KPA171202. GenBank Accession No. YP_055358.

SEQ ID NO:12: amino acid sequence of a wild-type endoglycoceramidase from Propionibacterium acnes KPA171202. GenBank Accession No. YP_055358.

SEQ ID NO:13: nucleic acid sequence of a wild-type endoglycoceramidase from Cyanea nozakii. GenBank Accession No. AB047321.

SEQ ID NO:14: amino acid sequence of a wild-type endoglycoceramidase from Cyanea nozakii. GenBank Accession No. BAB16369.

SEQ ID NO:15: amino acid sequence of a wild-type endoglycoceramidase from Cyanea nozakii. GenBank Accession No. BAB16369.

SEQ ID NO:16: nucleic acid sequence of a wild-type endoglycoceramidase from Cyanea nozakii. GenBank Accession No. AB047322.

SEQ ID NO:17: amino acid sequence of a wild-type endoglycoceramidase from Cyanea nozakii. GenBank Accession No. BAB16370.

SEQ ID NO:18: amino acid sequence of a wild-type endoglycoceramidase from Cyanea nozakii. GenBank Accession No. BAB16370.

SEQ ID NO:19: nucleic acid sequence of a wild-type endoglycoceramidase from Hydra magnipapillata. GenBank Accession No. AB179748.

SEQ ID NO:20: amino acid sequence of a wild-type endoglycoceramidase from Hydra magnipapillata. GenBank Accession No. BAD20464.

SEQ ID NO:21: amino acid sequence of a wild-type endoglycoceramidase from Hydra magnipapillata. GenBank Accession No. BAD20464.

SEQ ID NO:22: nucleic acid sequence of a wild-type endoglycoceramidase from Schistosoma japonicum. GenBank Accession No. AY813337.

SEQ ID NO:23: amino acid sequence of a wild-type endoglycoceramidase from Schistosoma japonicum. GenBank Accession No. AAW25069.

SEQ ID NO:24: amino acid sequence of a wild-type endoglycoceramidase from Schistosoma japonicum. GenBank Accession No. AAW25069.

SEQ ID NO:25: amino acid sequence of a putative wild-type endoglycoceramidase from Dictyostelium discoideum. GenBank Accession No. EAL72387.

SEQ ID NO:26: amino acid sequence of a putative wild-type endoglycoceramidase from Streptomyces avermitilis str. MA-4680. GenBank Accession No. BAC75219.

SEQ ID NO:27: amino acid sequence of a putative wild-type endoglycoceramidase from Leptospira interrogans serovar Copenhageni str. Fiocruz L1-130. GenBank Accession No. YP_003582.

SEQ ID NO:28: amino acid sequence of a putative wild-type endoglycoceramidase from Neurospora crassa. GenBank Accession No. XP_331009.

SEQ ID NO:29: amino acid sequence of mutant endoglycoceramidase A derived from AAB67050 (E233A).

SEQ ID NO:30: amino acid sequence of mutant endoglycoceramidase A derived from AAB67050 (E233S).

SEQ ID NO:31: amino acid sequence of mutant endoglycoceramidase A derived from AAB67050 (E233G).

SEQ ID NO:32: amino acid sequence of mutant endoglycoceramidase A derived from AAB67050 (E233D).

SEQ ID NO:33: amino acid sequence of mutant endoglycoceramidase A derived from AAB67050 (E233Q).

SEQ ID NO:34: 5′ PCR primer: 5′Copt

SEQ ID NO:35: 3′ PCR primer: 3′Asp PstI

SEQ ID NO:36: 3′ PCR primer: 3′Gln PstI

SEQ ID NO:37: 3′ PCR primer: 3′ Ala PstI-11-1

SEQ ID NO:38: 3′ PCR primer: 3′ Gly PstI-11-1

SEQ ID NO:39: 3′ PCR primer: 3′ Ser PstI-11-1

SEQ ID NO:40: Rhodococcus EGC-E351A-forward primer

SEQ ID NO:41: Rhodococcus EGC-E351A-reverse primer

SEQ ID NO:42: Rhodococcus EGC-E351D-forward primer

SEQ ID NO:43: Rhodococcus EGC-E351D-reverse primer

SEQ ID NO:44: Rhodococcus EGC-E351G-forward primer

SEQ ID NO:45: Rhodococcus EGC-E351G-reverse primer

SEQ ID NO:46: Rhodococcus EGC-E351S-forward primer

SEQ ID NO:47: Rhodococcus EGC-E351S-reverse primer

SEQ ID NO:48: nucleic acid sequence encoding mutant endoglycoceramidase His E351S, derived from GenBank Accession No. U39554.

SEQ ID NO:49: amino acid sequence encoding mutant endoglycoceramidase His E351S, derived from GenBank Accession No. AAB67050.

SEQ ID NO:50: Endoglycoceramidase identifying motif A.

SEQ ID NO:51: Endoglycoceramidase identifying motif B, including the acid-base sequence region.

SEQ ID NO:52: Endoglycoceramidase identifying motif C.

SEQ ID NO:53: Endoglycoceramidase identifying motif D, including the nucleophilic glutamic acid residue.

SEQ ID NO:54: Endoglycoceramidase identifying motif E, including nucleophilic carboxylate glutamic acid or aspartic acid residues.

SEQ ID NO:55: amino acid sequence of a mutant endoglycoceramidase derived from Rhodococcus sp. M-777. GenBank Accession No. AAB67050. X=Gly, Ala, Ser, Asp, Asn, Gln, Cys, Thr, Ile, Leu or Val.

SEQ ID NO:56: amino acid sequence of a mutant endoglycoceramidase derived from Rhodococcus sp. C9. GenBank Accession No. BAB17317. X=Gly, Ala, Ser, Asp, Asn, Gln, Cys, Thr, Ile, Leu or Val.

SEQ ID NO:57: amino acid sequence of a mutant endoglycoceramidase derived from Propionibacterium acnes KPA171202. GenBank Accession No. YP_056771. X=Gly, Ala, Ser, Asp, Asn, Gln, Cys, Thr, Ile, Leu or Val.

SEQ ID NO:58: amino acid sequence of a mutant endoglycoceramidase derived from Propionibacterium acnes KPA171202. GenBank Accession No. YP_055358. X=Gly, Ala, Ser, Asp, Asn, Gln, Cys, Thr, Ile, Leu or Val.

SEQ ID NO:59: amino acid sequence of a mutant endoglycoceramidase derived from Cyanea nozakii. GenBank Accession No. BAB16369. X=Gly, Ala, Ser, Asp, Asn, Gln, Cys, Thr, Ile, Leu or Val.

SEQ ID NO:60: amino acid sequence of a mutant endoglycoceramidase derived from Cyanea nozakii. GenBank Accession No. BAB16370. X=Gly, Ala, Ser, Asp, Asn, Gln, Cys, Thr, Ile, Leu or Val.

SEQ ID NO:61: amino acid sequence of a mutant endoglycoceramidase derived from Hydra magnipapillata. GenBank Accession No. BAD20464. X=Gly, Ala, Ser, Asp, Asn, Gln, Cys, Thr, Ile, Leu or Val.

SEQ ID NO:62: amino acid sequence of a mutant endoglycoceramidase derived from Schistosoma japonicum. GenBank Accession No. AAW25069. X=Gly, Ala, Ser, Asp, Asn, Gln, Cys, Thr, Ile, Leu or Val.

SEQ ID NO:63: amino acid sequence of a mutant endoglycoceramidase derived from Dictyostelium discoideum. GenBank Accession No. EAL72387. X=Gly, Ala, Ser, Asp, Asn, Gln, Cys, Thr, Ile, Leu or Val.

SEQ ID NO:64: amino acid sequence of a mutant endoglycoceramidase derived from Streptomyces avermitilis str. MA-4680. GenBank Accession No. BAC75219. X=Gly, Ala, Ser, Asp, Asn, Gln, Cys, Thr, Ile, Leu or Val.

SEQ ID NO:65: amino acid sequence of a mutant endoglycoceramidase derived from Leptospira interrogans serovar Copenhageni str. Fiocruz L1-130. GenBank Accession No. YP_003582. X=Gly, Ala, Ser, Asp, Asn, Gln, Cys, Thr, Ile, Leu or Val.

SEQ ID NO:66: amino acid sequence of a mutant endoglycoceramidase derived from Neurospora crassa. GenBank Accession No. XP_331009. X=Gly, Ala, Ser, Asp, Asn, Gln, Cys, Thr, Ile, Leu or Val.

SEQ ID NO:67: predicted N-terminal signal sequence for wild-type endoglycoceramidase from Rhodococcus sp. M-777. GenBank Accession No. AAB67050.

SEQ ID NO:68: predicted N-terminal signal sequence for wild-type endoglycoceramidase from Rhodococcus sp. C9. GenBank Accession No. BAB17317.

SEQ ID NO:69: predicted N-terminal signal sequence for wild-type endoglycoceramidase from Propionibacterium acnes KPA171202. GenBank Accession No. YP_056771.

SEQ ID NO:70: predicted N-terminal signal sequence for wild-type endoglycoceramidase from Propionibacterium acnes KPA171202. GenBank Accession No. YP_055358.

SEQ ID NO:71: predicted N-terminal signal sequence for wild-type endoglycoceramidase from Cyanea nozakii. GenBank Accession No. BAB16369 and BAB16370.

SEQ ID NO:72: predicted N-terminal signal sequence for wild-type endoglycoceramidase from Hydra magnipapillata. GenBank Accession No. BAD20464.

SEQ ID NO:73: predicted N-terminal signal sequence for wild-type endoglycoceramidase from Schistosoma japonicum. GenBank Accession No. AAW25069.

SEQ ID NO:74: predicted N-terminal signal sequence for wild-type endoglycoceramidase from Dictyostelium discoideum. GenBank Accession No. EAL72387.

SEQ ID NO:75: predicted N-terminal signal sequence for wild-type endoglycoceramidase from Streptomyces avermitilis str. MA-4680. GenBank Accession No. BAC75219.

SEQ ID NO:76: predicted N-terminal signal sequence for wild-type endoglycoceramidase from Neurospora crassa. GenBank Accession No. XP_331009.

SEQ ID NO:77: epitope tag for monoclonal anti-FLAG antibody, “FLAG tag”.

SEQ ID NO:78: DDDDK epitope tag.

SEQ ID NO:79: 6 residue histidine peptide.

SEQ ID NO:80: Polyoma middle T protein epitope tag.

SEQ ID NO:81: portion of expression vector pT7-7 with T7 promoter and transcription start site.

SEQ ID NO:82: Synthetic Construct.

SEQ ID NO:83: portion of expression vector pT7-7 with transcription start site.

FIELD OF THE INVENTION

The present invention relates to the field of synthesis of saccharides, particularly those of use in preparing glycolipids, e.g., glycosphingolipids. More specifically, the invention relates to a novel approach for producing a mutant endoglycoceramidase, which has a synthetic activity that can be used to catalyze the formation of the glycosidic linkage between a monosaccharide or oligosaccharide and an aglycone to form various glycolipids.

BACKGROUND OF THE INVENTION

Glycolipids, a group of amphipathic compounds that structurally consist of a sugar chain (monosaccharide or oligosaccharide) bound to an aglycone, are important cellular membrane components known to participate in various cellular events mediating physiological processes such as cell-cell recognition, antigenicity, and cell growth regulation (Hakomori, Annu. Rev. Biochem., 50: 733-764, 1981; Makita and Taniguchi, Glycolipid (Wiegandt, ed.) pp 59-82, Elsevier Scientific Publishing Co., New York, 1985). Because there are no known enzymes that can universally transfer a saccharyl residue to an aglycone (e.g., ceramide or sphingosine), synthesis of glycolipids usually requires a multi-step complex process that has the disadvantages of high cost and low yield.

Endoglycoceramidase (EC 3.2.1.123), an enzyme first isolated from the Actinomycetes of Rhodococcus strain (Horibata, J. Biol. Chem. May 2004 10.1094/jbc.M401460200; Ito and Yamagata, J. Biol. Chem., 261: 14278-14282, 1986), hydrolyzes the glycoside linkage between the sugar chain and the ceramide in glycolipids to produce intact monosaccharide or oligosaccharide and ceramide. To this date, several more endoglycoceramidases have been isolated and characterized (see e.g., Li et al., Biochem. Biophy. Res. Comm., 149: 167-172, 1987; Ito and Yamagata, J. Biol. Chem., 264: 9510-9519, 1989; Zhou et al., J. Biol. Chem., 264: 12272-12277, 1989; Ashida et al., Eur. J. Biochem., 205: 729-735, 1992; Izu et al., J. Biol. Chem., 272: 19846-19850, 1997; Horibata et al., J. Biol. Chem., 275: 31297-31304, 2000; Sakaguchi et al., J. Biochem., 128: 145-152, 2000; and U.S. Pat. No. 5,795,765). The active site of endoglycoceramidases has also been described by Sakaguchi et al., Biochem. Biophy. Res. Comm., 260: 89-93, 1999, as including a three amino acid segment of Asn-Glu-Pro, among which the Glu residue appears to be the most important to the enzymatic activity.

Endoglycoceramidases are also known to possess an additional transglycosylation activity, which is much weaker than the hydrolytic activity (Li et al., J. Biol. Chem., 266: 10723-10726, 1991; Ashida et al., Arch. Biochem. Biophy., 305: 559-562, 1993; Horibata et al., J. Biochem., 130: 263-268, 2001). This transglycosylation activity has not yet been exploited to synthesize glycolipids, because the far more potent hydrolytic activity of the enzyme counteracts this synthetic activity by quickly hydrolyzing newly made glycolipid.

In view of the deficiencies of the current methods for chemically synthesizing glycosphingolipids, a method that relies on the substrate specificity of a synthetic endoglycoceramidase would represent a significant advance in the field of saccharide (glycolipid) synthesis. The present invention provides such a synthetic endoglycoceramidase (“endoglycoceramide synthase”) and methods for using this new enzyme.

BRIEF SUMMARY OF THE INVENTION

The present invention provides mutant endoglycoceramidase enzymes that have synthetic activity, assembling a saccharide and an aglycone, e.g., a ceramide or sphingosine, to form a glycolipid or a component thereof. The enzymes of the invention exploit the exquisite selectivity of enzymatic reactions to simplify the synthesis of glycolipids.

In a first aspect, the invention provides a mutant endoglycoceramidase having a modified nucleophilic carboxylate (i.e., Glu or Asp) residue, wherein the nucleophilic carboxylate residue resides within a (Ile/Met/Leu/Phe/Val)-(Leu/Met/Ile/Val)-(Gly/Ser/Thr)-(Glu/Asp)-(Phe/Thr/Met/Leu)-(Gly/Leu/Phe) sequence (SEQ ID NO:54 or motif E), or conservative variants thereof, of a corresponding wild-type endoglycoceramidase, wherein the mutant endoglycoceramidase catalyzes the transfer of a saccharide moiety from a donor substrate to an acceptor substrate (e.g., an aglycone). Typically, the Glu/Asp residue is substituted with an amino acid residue other than a Glu/Asp residue, for example, a Gly, Ala, Ser, Asp, Asn, Gln, Cys, Thr, Ile, Leu or Val. In certain embodiments, the mutant endoglycoceramidase comprises any one of an amino acid sequence of SEQ ID NOs:55-66.
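The substitution set recited in this aspect can be illustrated with a short Python sketch that enumerates candidate point mutations for a nucleophilic carboxylate residue. The function name, the lookup table, and the compact mutation notation (wild-type residue, position, new residue, e.g., E351S) are assumptions introduced for illustration only; position 351 is the nucleophilic Glu given herein for Rhodococcus sp. M-777.

```python
# Replacement residues recited for the nucleophilic Glu/Asp (three-letter -> one-letter).
REPLACEMENTS = {
    "Gly": "G", "Ala": "A", "Ser": "S", "Asp": "D", "Asn": "N", "Gln": "Q",
    "Cys": "C", "Thr": "T", "Ile": "I", "Leu": "L", "Val": "V",
}


def candidate_mutations(wild_type_residue: str, position: int) -> list[str]:
    """List point mutations such as 'E351S' for a Glu (E) or Asp (D) nucleophile.

    Hypothetical helper: a residue is never "replaced" by itself, so an Asp
    nucleophile yields one fewer candidate than a Glu nucleophile.
    """
    if wild_type_residue not in ("E", "D"):
        raise ValueError("the nucleophilic carboxylate must be Glu (E) or Asp (D)")
    return [
        f"{wild_type_residue}{position}{new}"
        for new in REPLACEMENTS.values()
        if new != wild_type_residue
    ]


if __name__ == "__main__":
    # Position 351 is the nucleophilic Glu of the Rhodococcus sp. M-777 enzyme.
    print(candidate_mutations("E", 351))
```

The list is only a naming aid; whether a given substitution actually enhances synthetic activity has to be determined experimentally.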

In a related aspect, the invention provides a mutant endoglycoceramidase characterized in that

    i) in its native form the endoglycoceramidase comprises an amino acid sequence that is any one of SEQ. ID. NO.s: 2 (Rhodococcus; the polypeptide is encoded by nucleic acid sequence SEQ. ID. NO.: 1), 4 (Rhodococcus; SEQ. ID. NO.: 4 describes the nucleic acid and the polypeptide sequence), 6 (Propionibacterium acnes), 8 (Propionibacterium acnes), 10 (Cyanea nozakii), 12 (Cyanea nozakii), 14 (Hydra magnipapillata), 16 (Schistosoma japonicum), 17 (Dictyostelium discoideum), 18 (Streptomyces avermitilis), 19 (Leptospira interrogans), and 20 (Neurospora crassa); and
    ii) the nucleophilic carboxylate (i.e., Glu or Asp) residue within a (Ile/Met/Leu/Phe/Val)-(Leu/Met/Ile/Val)-(Gly/Ser/Thr)-(Glu/Asp)-(Phe/Thr/Met/Leu)-(Gly/Leu/Phe) sequence (SEQ ID NO:54) of a corresponding wild-type endoglycoceramidase is modified to an amino acid other than Glu/Asp.

In another aspect, the invention provides a method for making a mutant endoglycoceramidase having enhanced synthetic activity in comparison to a corresponding wild-type endoglycoceramidase, the method comprising modifying the nucleophilic carboxylate (i.e., Glu or Asp) residue in a corresponding wild-type endoglycoceramidase, wherein the nucleophilic Glu/Asp resides within a (Ile/Met/Leu/Phe/Val)-(Leu/Met/Ile/Val)-(Gly/Ser/Thr)-(Glu/Asp)-(Phe/Thr/Met/Leu)-(Gly/Leu/Phe) sequence (SEQ ID NO:54) of a corresponding wild-type endoglycoceramidase.

In another aspect, the invention provides a method of synthesizing a glycolipid or an aglycone, the method comprising contacting a donor substrate comprising a saccharide moiety and an acceptor substrate with a mutant endoglycoceramidase having a modified nucleophilic carboxylate residue (i.e., Glu or Asp), wherein the nucleophilic Glu/Asp resides within a (Ile/Met/Leu/Phe/Val)-(Leu/Met/Ile/Val)-(Gly/Ser/Thr)-(Glu/Asp)-(Phe/Thr/Met/Leu)-(Gly/Leu/Phe) sequence (SEQ ID NO:54 or motif E) of a corresponding wild-type endoglycoceramidase, under conditions wherein the endoglycoceramidase catalyzes the transfer of a saccharide moiety from a donor substrate to an acceptor substrate, thereby producing the glycolipid or aglycone.

In a further aspect, the invention provides expression vectors that comprise mutant endoglycoceramidase polynucleotide sequences; host cells that comprise the expression vectors; and methods of making the mutant endoglycoceramidase polypeptides described herein, by growing the host cells under conditions suitable for expression of the mutant endoglycoceramidase polypeptide.

Other objects, aspects and advantages of the invention will be apparent from the detailed description that follows.

Definitions

A “glycolipid” is a covalent conjugate between a glycosyl moiety and a substrate for a mutant endoglycoceramidase of the invention, such as an aglycone. An exemplary “glycolipid” is a covalent conjugate, between a glycosyl moiety and an aglycone, formed by a mutant endoglycoceramidase of the invention. The term “glycolipid” encompasses all glycosphingolipids, which are a group of amphipathic compounds that structurally consist of a sugar chain moiety (monosaccharide, oligosaccharide, or derivatives thereof) and an aglycone (i.e., a ceramide, a sphingosine, or a sphingosine analog). This term encompasses both cerebrosides and gangliosides. In certain embodiments, a glycolipid is an aglycone (non-carbohydrate alcohol (OH) or (SH)) conjugated to a non-reducing sugar and a non-glycoside.

An “aglycone,” as referred to herein, is an acceptor substrate onto which a mutant endoglycoceramidase of the invention transfers a glycosyl moiety from a glycosyl donor that is a substrate for said glycosyl donor. A glycosyl donor may be an activated or non-activated saccharide. An exemplary aglycone is a heteroalkyl moiety, which has the structure of, e.g., Formula Ia, Formula Ib or Formula II as shown below:

In Formula Ia and Formula Ib, the symbol Z represents OH, SH, or NR⁴R^(4′). R¹ and R² are members independently selected from NHR⁴, SR⁴, OR⁴, OCOR⁴, OC(O)NHR⁴, NHC(O)OR⁴, OS(O)₂OR⁴, C(O)R⁴, NHC(O)R⁴, detectable labels, and targeting moieties. The symbols R³, R⁴ and R^(4′), R⁵, R⁶ and R⁷ each are members independently selected from H, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, substituted or unsubstituted heterocycloalkyl.

In Formula II, Z¹ is a member selected from O, S, and NR⁴; R¹ and R² are members independently selected from NHR⁴, SR⁴, OR⁴, OCOR⁴, OC(O)NHR⁴, NHC(O)OR⁴, OS(O)₂OR⁴, C(O)R⁴, NHC(O)R⁴, detectable labels, and targeting moieties. The symbols R³, R⁴, R⁵, R⁶ and R⁷ each are members independently selected from H, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, substituted or unsubstituted heterocycloalkyl. Formula II is representative of certain embodiments wherein the aglycone portion is conjugated to a further substrate component, for example, a leaving group or a solid support.

The following abbreviations are used herein:

    Ara=arabinosyl;
    Cer=ceramide;
    Fru=fructosyl;
    Fuc=fucosyl;
    Gal=galactosyl;
    GalNAc=N-acetylgalactosaminyl;
    Glc=glucosyl;
    GlcNAc=N-acetylglucosaminyl;
    Man=mannosyl; and
    NeuAc=sialyl (N-acetylneuraminyl).

The term “sialic acid” or “sialic acid moiety” refers to any member of a family of nine-carbon carboxylated sugars. The most common member of the sialic acid family is N-acetyl-neuraminic acid (2-keto-5-acetamido-3,5-dideoxy-D-glycero-D-galactononulopyranos-1-onic acid, often abbreviated as Neu5Ac, NeuAc, or NANA). A second member of the family is N-glycolyl-neuraminic acid (Neu5Gc or NeuGc), in which the N-acetyl group of NeuAc is hydroxylated. A third sialic acid family member is 2-keto-3-deoxy-nonulosonic acid (KDN) (Nadano et al. (1986) J. Biol. Chem. 261: 11550-11557; Kanamori et al., J. Biol. Chem. 265: 21811-21819 (1990)). Also included are 9-substituted sialic acids such as a 9-O—C₁-C₆ acyl-Neu5Ac like 9-O-lactyl-Neu5Ac or 9-O-acetyl-Neu5Ac, 9-deoxy-9-fluoro-Neu5Ac and 9-azido-9-deoxy-Neu5Ac. For review of the sialic acid family, see, e.g., Varki, Glycobiology 2: 25-40 (1992); Sialic Acids: Chemistry, Metabolism and Function, R. Schauer, Ed. (Springer-Verlag, New York (1992)). The synthesis and use of sialic acid compounds in a sialylation procedure is disclosed in international application WO 92/16640, published Oct. 1, 1992.

The term “ceramide,” as used herein, encompasses all ceramides and sphingosine as conventionally defined. See, for example, Berg, et al., Biochemistry, 2002, 5th ed., W.H. Freeman and Co.

The term “sphingosine analog” refers to lipid moieties that are chemically similar to sphingosine, but are modified at the polar head and/or the hydrophobic carbon chain. Sphingolipid analog moieties useful as acceptor substrates in the present methods include, but are not limited to, those described in co-pending patent applications PCT/US2004/006904 (which claims priority to U.S. Provisional Patent Application No. 60/452,796); U.S. patent application Ser. No. 10/487,841; U.S. patent application Ser. Nos. 10/485,892; 10/485,195, and 60/626,678, the disclosures of each of which are hereby incorporated herein by reference in their entirety for all purposes.

In general, the sphingosine analogs described in the above-referenced applications are those compounds having the formula:

wherein Z is a member selected from O, S, C(R²)₂ and NR²; X is a member selected from H, —OR³, —NR³R⁴, —SR³, and —CHR³R⁴; R¹, R², R³ and R⁴ are members independently selected from H, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, substituted or unsubstituted heterocycloalkyl, —C(=M)R⁵, —C(=M)-Z—R⁵, —SO₂R⁵, and —SO₃; wherein M and Z¹ are members independently selected from O, NR⁶ or S; Y is a member selected from H, —OR⁷, —SR⁷, —NR⁷R⁸, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, and substituted or unsubstituted heterocycloalkyl, wherein R⁵, R⁶, R⁷ and R⁸ are members independently selected from H, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, substituted or unsubstituted heterocycloalkyl; and R^(a), R^(b), R^(c) and R^(d) are each independently H, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, substituted or unsubstituted heterocycloalkyl.

An “acceptor substrate” for a wild-type endoglycoceramidase or a mutant endoglycoceramidase is any aglycone moiety that can act as an acceptor for a particular endoglycoceramidase. When the acceptor substrate is contacted with the corresponding endoglycoceramidase and sugar donor substrate, and other necessary reaction mixture components, and the reaction mixture is incubated for a sufficient period of time, the endoglycoceramidase transfers sugar residues from the sugar donor substrate to the acceptor substrate. The acceptor substrate can vary for different types of a particular endoglycoceramidase. Accordingly, the term “acceptor substrate” is taken in context with the particular endoglycoceramidase or mutant endoglycoceramidase of interest for a particular application. Acceptor substrates for endoglycoceramidases and mutant endoglycoceramidases are described herein.

A “donor substrate” for wild-type and mutant endoglycoceramidases includes any activated glycosyl derivatives of anomeric configuration opposite the natural glycosidic linkage. The enzymes of the invention are used to couple α-modified or β-modified glycosyl donors, usually α-modified glycosyl donors, with glycoside acceptors. Preferred donor molecules are glycosyl fluorides, although donors with other groups which are reasonably small and which function as relatively good leaving groups can also be used. Examples of other glycosyl donor molecules include glycosyl chlorides, bromides, acetates, mesylates, propionates, pivaloates, and glycosyl molecules modified with substituted phenols. Among the α-modified or β-modified glycosyl donors, α-galactosyl, α-mannosyl, α-glucosyl, α-fucosyl, α-xylosyl, α-sialyl, α-N-acetylglucosaminyl, α-N-acetylgalactosaminyl, β-galactosyl, β-mannosyl, β-glucosyl, β-fucosyl, β-xylosyl, β-sialyl, β-N-acetylglucosaminyl and β-N-acetylgalactosaminyl are most preferred. The donor molecules can be monosaccharides, or may themselves contain multiple sugar moieties (oligosaccharides). Donor substrates of use in the particular methods include those described in U.S. Pat. Nos. 6,284,494; 6,204,029; 5,952,203; and 5,716,812, the disclosures of which are hereby incorporated herein by reference in their entirety for all purposes.

The term “contacting” is used herein interchangeably with the following:combined with, added to, mixed with, passed over, incubated with, flowedover, etc.

“Endoglycoceramidase,” as used herein, refers to an enzyme that in its native or wild-type version has a primary activity of cleaving the glycosidic linkage between a monosaccharide or an oligosaccharide and a ceramide (or sphingosine) of an acidic or neutral glycolipid, producing intact monosaccharide or oligosaccharide and ceramide (Registry number: EC 3.2.1.123). The wild-type version of this enzyme may also have a secondary activity of catalyzing the formation of the glycosidic linkage between a monosaccharide or oligosaccharide and an aglycone (i.e., a ceramide or a sphingosine) to form various glycolipids. Wild-type endoglycoceramidases have at least two identifiable conserved motifs, including an acid-base region (Val-X₁-(Ala/Gly)-(Tyr/Phe)-(Asp/Glu)-(Leu/Ile)-X₂-Asn-Glu-Pro-X₃-X₄-Gly, or motif B or SEQ ID NO:51), and a nucleophilic region ((Ile/Met/Leu/Phe/Val)-(Leu/Met/Ile/Val)-(Gly/Ser/Thr)-Glu-(Phe/Thr/Met/Leu)-(Gly/Leu/Phe), or motif D or SEQ ID NO:53).
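The two conserved motifs named in this definition lend themselves to a simple pattern search. The following Python sketch, offered only as an illustration, writes motif B and motif D as regular expressions over one-letter amino acid codes and reports the 1-based position of the Glu residue captured in each. Treating X₁ to X₄ as "any residue", the function name, and the toy sequence are all assumptions; the toy sequence is not a real endoglycoceramidase.

```python
import re

# Motif B (acid-base region, SEQ ID NO:51); X1-X4 are assumed to be any residue.
MOTIF_B = re.compile(r"V[A-Z][AG][YF][DE][LI][A-Z]N(E)P[A-Z][A-Z]G")
# Motif D (nucleophilic region, SEQ ID NO:53).
MOTIF_D = re.compile(r"[IMLFV][LMIV][GST](E)[FTML][GLF]")


def locate_catalytic_glu(protein: str) -> dict[str, list[int]]:
    """Return 1-based positions of the Glu captured by each motif."""

    def positions(pattern: re.Pattern) -> list[int]:
        # m.start(1) is the 0-based index of the captured Glu.
        return [m.start(1) + 1 for m in pattern.finditer(protein)]

    return {"acid_base": positions(MOTIF_B), "nucleophile": positions(MOTIF_D)}


if __name__ == "__main__":
    # Hypothetical toy sequence built to contain one copy of each motif.
    toy = "AAVSAYELMNEPRWGKKAILGEFGPQ"
    print(locate_catalytic_glu(toy))
```

A pattern hit of this kind only flags a candidate residue; the positions stated in this specification for particular enzymes remain the authoritative values.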

The terms “mutated” or “modified,” as used in the context of altering the structure or enzymatic activity of a wild-type endoglycoceramidase, refer to the deletion, insertion, or substitution of any nucleotide or amino acid residue, by chemical, enzymatic, or any other means, in a polynucleotide sequence encoding an endoglycoceramidase or the amino acid sequence of a wild-type endoglycoceramidase, respectively, such that the amino acid sequence of the resulting endoglycoceramidase is altered at one or more amino acid residues. The site for such an activity-altering mutation may be located anywhere in the enzyme, including within the active site of the endoglycoceramidase, particularly involving the glutamic acid residue of the Asn-Glu-Pro subsequence of the acid-base sequence region. An artisan of ordinary skill will readily locate this Glu residue, for example, at position 233 in SEQ ID NO:3 and at position 224 in SEQ ID NO:6. Other examples of Glu residues that, once mutated, can alter the enzymatic activity of an endoglycoceramidase include a carboxylate (i.e., Glu or Asp) nucleophilic Glu/Asp residue (bolded) in the (Ile/Met/Leu/Phe/Val)-(Leu/Met/Ile/Val)-(Gly/Ser/Thr)-Glu/Asp-(Phe/Thr/Met/Leu)-(Gly/Leu/Phe) motif of a corresponding wild-type endoglycoceramidase.

A “mutant endoglycoceramidase” or “modified endoglycoceramidase” of this invention thus comprises at least one mutated or modified amino acid residue. On the other hand, the wild-type endoglycoceramidase whose coding sequence is modified to generate a mutant endoglycoceramidase is referred to in this application as “the corresponding native or wild-type endoglycoceramidase.” One exemplary mutant endoglycoceramidase of the invention includes the deletion or substitution of a nucleophilic carboxylate Glu/Asp residue (bolded) in the (Ile/Met/Leu/Phe/Val)-(Leu/Met/Ile/Val)-(Gly/Ser/Thr)-Glu/Asp-(Phe/Thr/Met/Leu)-(Gly/Leu/Phe) motif of a corresponding wild-type endoglycoceramidase. One exemplary mutant endoglycoceramidase of the invention includes a mutation within the active site, e.g., the deletion or substitution of the Glu residue within the Asn-Glu-Pro subsequence of the acid-base sequence region. The mutant endoglycoceramidase exhibits an altered enzymatic activity, e.g., an enhanced glycolipid synthetic activity, in comparison with its wild-type counterpart. A mutant endoglycoceramidase that has demonstrated an increased glycolipid synthetic activity is also called an “endoglycoceramide synthase.”
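To make the relationship between a wild-type sequence and its substitution mutant concrete, the Python sketch below applies a single point substitution written in the compact notation used for the mutants named herein (e.g., E351S). It is a hypothetical helper for illustration only: the function name and the toy sequence are assumptions, and the check that the stated wild-type residue is actually present is a safeguard added here, not a step described in this specification.

```python
import re


def apply_substitution(protein: str, mutation: str) -> str:
    """Apply a point substitution such as 'E351S' to a one-letter protein sequence.

    Positions are 1-based. Raises ValueError if the notation is malformed,
    the position is out of range, or the wild-type residue does not match.
    """
    parsed = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if parsed is None:
        raise ValueError(f"unrecognized mutation notation: {mutation!r}")
    wild_type, position, replacement = parsed.group(1), int(parsed.group(2)), parsed.group(3)
    if not 1 <= position <= len(protein):
        raise ValueError(f"position {position} is outside the sequence")
    if protein[position - 1] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, found {protein[position - 1]}"
        )
    # Rebuild the sequence with the single residue exchanged.
    return protein[: position - 1] + replacement + protein[position:]


if __name__ == "__main__":
    toy = "MKTAAILGEFGPQ"  # hypothetical toy sequence, not a real enzyme
    print(apply_substitution(toy, "E9S"))
```

Verifying the wild-type residue before substituting guards against off-by-one numbering differences, for example between sequences with and without an N-terminal signal peptide.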

The term “acid-base sequence region” refers to a conserved Val-X₁-(Ala/Gly)-(Tyr/Phe)-(Asp/Glu)-(Leu/Ile)-X₂-Asn-Glu-Pro-X₃-X₄-Gly sequence (SEQ ID NO:51) in a corresponding wild-type endoglycoceramidase which includes a conserved Asn-Glu-Pro subsequence. The acid-base glutamic acid residue is located within the conserved Asn-Glu-Pro subsequence, for example, at position 233 in Rhodococcus sp. M-777; position 224 in Rhodococcus sp. C9; position 229 in Propionibacterium acnes EGCa; position 248 in Propionibacterium acnes EGCb; position 238 in Cyanea nozakii; at position 229 in Hydra magnipapillata; at position 234 in Dictyostelium; at position 214 in Schistosoma; at position 241 in Leptospira interrogans; at position 272 of Streptomyces; and at position 247 of Neurospora crassa (see, FIG. 15). The conserved sequence encoding a three-amino acid segment Asn-Glu-Pro was previously identified within the active site of endoglycoceramidases, and the Glu residue within the segment was thought to be connected to the hydrolytic activity of the endoglycoceramidase (Sakaguchi et al., Biochem. Biophys. Res. Commun., 1999, 260: 89-93).

The term “nucleophilic residue” or “nucleophilic motif” refers to the carboxylate amino acid residue within the (Ile/Met/Leu/Phe/Val)-(Leu/Met/Ile/Val)-(Gly/Ser/Thr)-(Asp/Glu)-(Phe/Thr/Met/Leu)-(Gly/Leu/Phe) motif (SEQ ID NO:54) of a corresponding wild-type endoglycoceramidase. The nucleophilic residue can be a glutamate or an aspartate, usually a glutamate. A nucleophilic glutamic acid residue is located, for example, at position 351 in Rhodococcus sp. M-777; position 343 in Rhodococcus sp. C9; position 342 in Propionibacterium acnes EGCa; position 360 in Propionibacterium acnes EGCb; position 361 in Cyanea nozakii; at position 349 in Hydra magnipapillata; at position 354 in Dictyostelium; at position 351 in Schistosoma; at position 461 in Leptospira interrogans; at position 391 of Streptomyces; and at position 498 of Neurospora crassa (see, FIG. 15).

The recombinant fusion proteins of the invention can be constructed and expressed as a fusion protein with a molecular “purification tag” at one end, which facilitates purification of the protein. Such tags can also be used for immobilization of a protein of interest during the glycolipid synthesis reaction. Exemplified purification tags include MalE, 6 or more sequential histidine residues, cellulose binding protein, maltose binding protein (malE), glutathione S-transferase (GST), lactoferrin, and Sumo fusion protein cleavable sequences (commercially available from LifeSensors, Malvern, Pa. and EMD Biosciences). Suitable tags include “epitope tags,” which are protein sequences that are specifically recognized by an antibody. Epitope tags are generally incorporated into fusion proteins to enable the use of a readily available antibody to unambiguously detect or isolate the fusion protein. A “FLAG tag” is a commonly used epitope tag, specifically recognized by a monoclonal anti-FLAG antibody, consisting of the sequence AspTyrLysAspAspAspAspLys or a substantially identical variant thereof. Other epitope tags that can be used in the invention include, e.g., myc tag, AU1, AU5, DDDDK (EC5), E tag, E2 tag, Glu-Glu, a 6 residue peptide, EYMPME, derived from the Polyoma middle T protein, HA, HSV, IRS, KT3, S tag, S1 tag, T7 tag, V5 tag, VSV-G, β-galactosidase, Gal4, green fluorescent protein (GFP), luciferase, protein C, protein A, cellulose binding protein, GST (glutathione S-transferase), a strep-tag, Nus-S, PPI-ases, Pfg 27, calmodulin binding protein, dsb A and fragments thereof, and granzyme B. Epitope peptides and antibodies that bind specifically to epitope sequences are commercially available from, e.g., Covance Research Products, Inc.; Bethyl Laboratories, Inc.; Abcam Ltd.; and Novus Biologicals, Inc.

The term “nucleic acid” or “polynucleotide” refers to deoxyribonucleic acids (DNA) or ribonucleic acids (RNA) and polymers thereof in either single- or double-stranded form. Unless specifically limited, the term encompasses nucleic acids containing known analogues of natural nucleotides that have similar binding properties as the reference nucleic acid and are metabolized in a manner similar to naturally occurring nucleotides. Unless otherwise indicated, a particular nucleic acid sequence also implicitly encompasses conservatively modified variants thereof (e.g., degenerate codon substitutions), alleles, orthologs, SNPs, and complementary sequences as well as the sequence explicitly indicated. Specifically, degenerate codon substitutions may be achieved by generating sequences in which the third position of one or more selected (or all) codons is substituted with mixed-base and/or deoxyinosine residues (Batzer et al., Nucleic Acid Res. 19:5081 (1991); Ohtsuka et al., J. Biol. Chem. 260:2605-2608 (1985); and Rossolini et al., Mol. Cell. Probes 8:91-98 (1994)). The term nucleic acid is used interchangeably with gene, cDNA, and mRNA encoded by a gene.

The term “gene” means the segment of DNA involved in producing a polypeptide chain. It may include regions preceding and following the coding region (leader and trailer) as well as intervening sequences (introns) between individual coding segments (exons).

The term “operably linked” refers to functional linkage between a nucleic acid expression control sequence (such as a promoter, signal sequence, or array of transcription factor binding sites) and a second nucleic acid sequence, wherein the expression control sequence affects transcription and/or translation of the nucleic acid corresponding to the second sequence.

A “recombinant expression cassette” or simply an “expression cassette” is a nucleic acid construct, generated recombinantly or synthetically, with nucleic acid elements that are capable of affecting expression of a structural gene in hosts compatible with such sequences. Expression cassettes include at least promoters and, optionally, transcription termination signals. Typically, the recombinant expression cassette includes a nucleic acid to be transcribed (e.g., a nucleic acid encoding a desired polypeptide), and a promoter. Additional factors necessary or helpful in effecting expression may also be used as described herein. For example, an expression cassette can also include nucleotide sequences that encode a signal sequence that directs secretion of an expressed protein from the host cell. Transcription termination signals, enhancers, and other nucleic acid sequences that influence gene expression can also be included in an expression cassette.

The term “amino acid” refers to naturally occurring and synthetic amino acids, as well as amino acid analogs and amino acid mimetics that function in a manner similar to the naturally occurring amino acids. Naturally occurring amino acids are those encoded by the genetic code, as well as those amino acids that are later modified, e.g., hydroxyproline, γ-carboxyglutamate, and O-phosphoserine.

“Amino acid analogs” refers to compounds that have the same basic chemical structure as a naturally occurring amino acid, i.e., an α carbon that is bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine, norleucine, methionine sulfoxide, methionine methyl sulfonium. Such analogs have modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic chemical structure as a naturally occurring amino acid.

“Unnatural amino acids” are not encoded by the genetic code and can, but do not necessarily, have the same basic structure as a naturally occurring amino acid. Unnatural amino acids include, but are not limited to, azetidinecarboxylic acid, 2-aminoadipic acid, 3-aminoadipic acid, beta-alanine, aminopropionic acid, 2-aminobutyric acid, 4-aminobutyric acid, 6-aminocaproic acid, 2-aminoheptanoic acid, 2-aminoisobutyric acid, 3-aminoisobutyric acid, 2-aminopimelic acid, tertiary-butylglycine, 2,4-diaminoisobutyric acid, desmosine, 2,2′-diaminopimelic acid, 2,3-diaminopropionic acid, N-ethylglycine, N-ethylasparagine, homoproline, hydroxylysine, allo-hydroxylysine, 3-hydroxyproline, 4-hydroxyproline, isodesmosine, allo-isoleucine, N-methylalanine, N-methylglycine, N-methylisoleucine, N-methylpentylglycine, N-methylvaline, naphthalanine, norvaline, ornithine, pentylglycine, pipecolic acid and thioproline.

“Amino acid mimetics” refers to chemical compounds that have a structure that is different from the general chemical structure of an amino acid, but that function in a manner similar to a naturally occurring amino acid.

Amino acids may be referred to herein by either the commonly known three letter symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly accepted single-letter codes.

“Conservatively modified variants” applies to both amino acid and nucleic acid sequences. With respect to particular nucleic acid sequences, “conservatively modified variants” refers to those nucleic acids that encode identical or essentially identical amino acid sequences, or where the nucleic acid does not encode an amino acid sequence, to essentially identical sequences. Because of the degeneracy of the genetic code, a large number of functionally identical nucleic acids encode any given protein. For instance, the codons GCA, GCC, GCG and GCU all encode the amino acid alanine. Thus, at every position where an alanine is specified by a codon, the codon can be altered to any of the corresponding codons described without altering the encoded polypeptide. Such nucleic acid variations are “silent variations,” which are one species of conservatively modified variations. Every nucleic acid sequence herein that encodes a polypeptide also describes every possible silent variation of the nucleic acid. One of skill will recognize that each codon in a nucleic acid (except AUG, which is ordinarily the only codon for methionine, and TGG, which is ordinarily the only codon for tryptophan) can be modified to yield a functionally identical molecule. Accordingly, each silent variation of a nucleic acid that encodes a polypeptide is implicit in each described sequence.
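The degeneracy described above can be illustrated with a short sketch that lists the synonymous codons for a given codon and counts the silent variations of a coding sequence. It assumes the standard genetic code written in RNA letters; the function names and the example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order; '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(codon):
    """Return all codons (sorted) that encode the same amino acid as `codon`."""
    target = CODON_TABLE[codon]
    return sorted(c for c, aa in CODON_TABLE.items() if aa == target)


def count_silent_variants(mrna):
    """Count coding sequences (including the original) that encode the same polypeptide."""
    total = 1
    for i in range(0, len(mrna) - len(mrna) % 3, 3):
        total *= len(synonymous_codons(mrna[i:i + 3]))
    return total


print(synonymous_codons("GCU"))            # the four alanine codons
print(synonymous_codons("AUG"))            # methionine has a single codon
print(count_silent_variants("AUGGCUUGG"))  # Met-Ala-Trp
```

In this toy case only the alanine codon can vary, so the Met-Ala-Trp sequence has four equivalent coding sequences, which matches the statement that AUG and the tryptophan codon are the exceptions.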

As to amino acid sequences, one of skill will recognize that individual substitutions, deletions or additions to a nucleic acid, peptide, polypeptide, or protein sequence which alter, add or delete a single amino acid or a small percentage of amino acids in the encoded sequence are “conservatively modified variants” where the alteration results in the substitution of an amino acid with a chemically similar amino acid (i.e., hydrophobic, hydrophilic, positively charged, neutral, negatively charged). Exemplified hydrophobic amino acids include valine, leucine, isoleucine, methionine, phenylalanine, and tryptophan. Exemplified aromatic amino acids include phenylalanine, tyrosine and tryptophan. Exemplified aliphatic amino acids include serine and threonine. Exemplified basic amino acids include lysine, arginine and histidine. Exemplified amino acids with carboxylate side chains include aspartate and glutamate. Exemplified amino acids with carboxamide side chains include asparagine and glutamine. Conservative substitution tables providing functionally similar amino acids are well known in the art. Such conservatively modified variants are in addition to and do not exclude polymorphic variants, interspecies homologs, and alleles of the invention.

The following eight groups each contain amino acids that are conservative substitutions for one another:

1) Alanine (A), Glycine (G);

2) Aspartic acid (D), Glutamic acid (E);

3) Asparagine (N), Glutamine (Q);

4) Arginine (R), Lysine (K);

5) Isoleucine (I), Leucine (L), Methionine (M), Valine (V);

6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W);

7) Serine (S), Threonine (T); and

8) Cysteine (C), Methionine (M)

(see, e.g., Creighton, Proteins (1984)).
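The eight groups listed above can be encoded directly as sets, which gives a simple test of whether a given substitution is conservative in the sense used here. This is a minimal sketch; the function name is hypothetical, and methionine is deliberately placed in two groups (5 and 8), as in the list.

```python
# The eight conservative substitution groups, in one-letter codes.
CONSERVATIVE_GROUPS = [
    set("AG"),    # 1) Alanine, Glycine
    set("DE"),    # 2) Aspartic acid, Glutamic acid
    set("NQ"),    # 3) Asparagine, Glutamine
    set("RK"),    # 4) Arginine, Lysine
    set("ILMV"),  # 5) Isoleucine, Leucine, Methionine, Valine
    set("FYW"),   # 6) Phenylalanine, Tyrosine, Tryptophan
    set("ST"),    # 7) Serine, Threonine
    set("CM"),    # 8) Cysteine, Methionine
]


def is_conservative(original, replacement):
    """True if two different residues share at least one of the eight groups."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # no substitution took place
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


print(is_conservative("D", "E"))  # same group (2)
print(is_conservative("E", "A"))  # different groups
```

Under these groups, replacing a catalytic Glu with Asp would count as conservative, whereas replacing it with Ala, Gly or Ser would not.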

“Polypeptide,” “peptide,” and “protein” are used interchangeably herein to refer to a polymer of amino acid residues. All three terms apply to amino acid polymers in which one or more amino acid residues is an artificial chemical mimetic of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers and non-naturally occurring amino acid polymers. As used herein, the terms encompass amino acid chains of any length, including full-length proteins, wherein the amino acid residues are linked by covalent peptide bonds.

A “heterologous polynucleotide,” “heterologous nucleic acid,” or “heterologous polypeptide,” as used herein, is one that originates from a source foreign to the particular host cell, or, if from the same source, is modified from its original form. Thus, a heterologous endoglycoceramidase gene in a prokaryotic host cell includes an endoglycoceramidase gene that is endogenous to the particular host cell but has been modified. Modification of the heterologous sequence may occur, e.g., by treating the DNA with a restriction enzyme to generate a DNA fragment that is capable of being operably linked to a promoter. Techniques such as site-directed mutagenesis are also useful for modifying a heterologous sequence.

A “subsequence” refers to a sequence of nucleic acids or amino acids that comprises a part of a longer sequence of nucleic acids or amino acids (e.g., polypeptide), respectively.

The terms “identical” or percent “identity,” in the context of two or more nucleic acids or polypeptide sequences, refer to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same (i.e., about 60% identity, preferably 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or higher identity over a specified region, for example over a region of at least about 25, 50, 75, 100, 150, 200, 250, 500, 1000, or more nucleic acids or amino acids, up to the full length sequence, when compared and aligned for maximum correspondence over a comparison window or designated region) as measured using the BLAST or BLAST 2.0 sequence comparison algorithms with default parameters described below, or by manual alignment and visual inspection (see, e.g., NCBI web site http://www.ncbi.nlm.nih.gov/BLAST/ or the like). Such sequences are then said to be “substantially identical.” This definition also refers to, or may be applied to, the complement of a test sequence. The definition also includes sequences that have deletions and/or additions, as well as those that have substitutions. As described below, the preferred algorithms can account for gaps and the like. Preferably, identity exists over a region that is at least about 25 amino acids or nucleotides in length, or more preferably over a region that is 50-100 amino acids or nucleotides in length.

For sequence comparison, typically one sequence acts as a reference sequence, to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are entered into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. Preferably, default program parameters can be used, or alternative parameters can be designated. The sequence comparison algorithm then calculates the percent sequence identities for the test sequences relative to the reference sequence, based on the program parameters.
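Once a test sequence has been aligned to a reference sequence, the percent identity over the compared region reduces to a simple count. The sketch below assumes the two sequences are already aligned to equal length with “-” as the gap character and uses the number of alignment columns as the denominator; other conventions for the denominator exist, and the function name and example strings are hypothetical.

```python
def percent_identity(aligned_test, aligned_reference, gap="-"):
    """Percent of alignment columns in which both sequences carry the same residue."""
    if len(aligned_test) != len(aligned_reference):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns that are gaps in both sequences; they carry no information.
    columns = [
        (a, b)
        for a, b in zip(aligned_test, aligned_reference)
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("no comparable positions")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


# Hypothetical aligned fragments: one substitution and one gap over eight columns.
identity = percent_identity("ACDE-FGH", "ACDQAFGH")
print(identity)
```

In this toy alignment six of the eight columns match, so the two fragments would meet a 75% identity threshold but not an 80% one.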

A “comparison window”, as used herein, includes reference to a segment of any one of the number of contiguous positions selected from the group consisting of from 20 to 600, usually about 50 to about 200, more usually about 100 to about 150, in which a sequence may be compared to a reference sequence of the same number of contiguous positions after the two sequences are optimally aligned. Methods of alignment of sequences for comparison are well-known in the art. Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith & Waterman, Adv. Appl. Math. 2:482 (1981), by the homology alignment algorithm of Needleman & Wunsch, J. Mol. Biol. 48:443 (1970), by the search for similarity method of Pearson & Lipman, Proc. Nat'l. Acad. Sci. USA 85:2444 (1988), by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, Wis.), or by manual alignment and visual inspection (see, e.g., Current Protocols in Molecular Biology (Ausubel et al., eds. 1995 supplement)).

A preferred example of an algorithm that is suitable for determining percent sequence identity and sequence similarity is the BLAST and BLAST 2.0 algorithms, which are described in Altschul et al., J. Mol. Biol. 215:403-410 (1990) and Altschul et al., Nuc. Acids Res. 25:3389-3402 (1997), respectively. BLAST and BLAST 2.0 are used, with the parameters described herein, to determine percent sequence identity for the nucleic acids and proteins of the invention. Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al., supra). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5, N=−4 and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a wordlength of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915 (1992)), alignments (B) of 50, M=5, N=−4, and a comparison of both strands.
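The extension step described above can be sketched for nucleotide sequences as follows. This is a simplified illustration rather than the BLAST implementation: it extends a seed word hit in one direction only, without gaps, using the reward M=5 and penalty N=−4 given above. The function name, the value chosen for X and the example sequences are assumptions made for the example.

```python
def extend_hit_right(query, subject, q_pos, s_pos, word_len,
                     reward=5, penalty=-4, x_drop=10):
    """Extend a seed word hit to the right, ungapped.

    Returns (best_score, best_length) of the highest-scoring extension.
    Extension stops when the score falls off by x_drop from its maximum,
    when the cumulative score reaches zero or below, or at a sequence end.
    """
    # Score the seed word itself.
    score = sum(
        reward if query[q_pos + i] == subject[s_pos + i] else penalty
        for i in range(word_len)
    )
    best_score, best_len = score, word_len
    i = word_len
    while q_pos + i < len(query) and s_pos + i < len(subject):
        score += reward if query[q_pos + i] == subject[s_pos + i] else penalty
        i += 1
        if score > best_score:
            best_score, best_len = score, i
        elif best_score - score >= x_drop or score <= 0:
            break
    return best_score, best_len


# Hypothetical sequences: eight matching bases followed by mismatches.
hsp = extend_hit_right("ACGTACGTTTTT", "ACGTACGTAAAA", 0, 0, word_len=4)
print(hsp)
```

Here the extension gains score over the eight matching positions and is abandoned after three consecutive mismatches, once the score has dropped 12 below its maximum of 40. A full implementation would extend in both directions and, for proteins, would score with a matrix such as BLOSUM62.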

The phrase “stringent hybridization conditions” refers to conditions under which a nucleic acid will hybridize to its target subsequence, typically in a complex mixture of nucleic acids, but to no other sequences. Stringent conditions are sequence-dependent and will be different in different circumstances. Longer sequences hybridize specifically at higher temperatures. An extensive guide to the hybridization of nucleic acids is found in Tijssen, Techniques in Biochemistry and Molecular Biology: Hybridization with Nucleic Acid Probes, “Overview of principles of hybridization and the strategy of nucleic acid assays” (1993). Generally, stringent conditions are selected to be about 5-10° C. lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm is the temperature (under defined ionic strength, pH, and nucleic acid concentration) at which 50% of the probes complementary to the target hybridize to the target sequence at equilibrium (as the target sequences are present in excess, at Tm, 50% of the probes are occupied at equilibrium). Stringent conditions will be those in which the salt concentration is less than about 1.0 M sodium ion, typically about 0.01 to 1.0 M sodium ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30° C. for short nucleic acid sequences (e.g., 10 to 50 nucleotides) and at least about 60° C. for long nucleic acid sequences (e.g., greater than 50 nucleotides). Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide. For selective or specific hybridization, a positive signal is at least two times background, preferably 10 times background hybridization. Exemplary “highly stringent” hybridization conditions include hybridization in a buffer comprising 50% formamide, 5×SSC, and 1% SDS at 42° C., or hybridization in a buffer comprising 5×SSC and 1% SDS at 65° C., both with a wash of 0.2×SSC and 0.1% SDS at 65° C. Exemplary “moderately stringent hybridization conditions” include a hybridization in a buffer of 40% formamide, 1 M NaCl, and 1% SDS at 37° C., and a wash in 1×SSC at 45° C.

The term “alkyl,” by itself or as part of another substituent, means, unless otherwise stated, a straight or branched chain, or cyclic hydrocarbon radical, or combination thereof, which may be fully saturated, mono- or polyunsaturated and can include mono-, di- and multivalent radicals, having the number of carbon atoms designated (i.e., C₁-C₁₀ means one to ten carbons). Examples of saturated alkyl radicals include, but are not limited to, groups such as methyl, methylene, ethyl, ethylene, n-propyl, isopropyl, n-butyl, t-butyl, isobutyl, sec-butyl, cyclohexyl, (cyclohexyl)methyl, cyclopropylmethyl, homologs and isomers of, for example, n-pentyl, n-hexyl, n-heptyl, n-octyl, and the like. An unsaturated alkyl group is one having one or more double bonds or triple bonds. Examples of unsaturated alkyl groups include, but are not limited to, vinyl, 2-propenyl, crotyl, 2-isopentenyl, 2-(butadienyl), 2,4-pentadienyl, 3-(1,4-pentadienyl), ethynyl, 1- and 3-propynyl, 3-butynyl, and the higher homologs and isomers. The term “alkyl,” unless otherwise noted, includes “alkylene” and those derivatives of alkyl defined in more detail below, such as “heteroalkyl.” Alkyl groups, which are limited to hydrocarbon groups, are termed “homoalkyl.”

The term “heteroalkyl,” by itself or in combination with another term, means, unless otherwise stated, a stable straight or branched chain, or cyclic hydrocarbon radical, or combinations thereof, consisting of the stated number of carbon atoms and at least one heteroatom selected from the group consisting of O, N, Si and S, and wherein the nitrogen and sulfur atoms may optionally be oxidized and the nitrogen heteroatom may optionally be quaternized. The heteroatom(s) O, N, S and Si may be placed at any interior position of the heteroalkyl group or at the position at which the alkyl group is attached to the remainder of the molecule. Examples include, but are not limited to, —CH₂—CH₂—O—CH₃, —CH₂—CH₂—NH—CH₃, —CH₂—CH₂—N(CH₃)—CH₃, —CH₂—S—CH₂—CH₃, —CH₂—CH₂—S(O)—CH₃, —CH₂—CH₂—S(O)₂—CH₃, —CH═CH—O—CH₃, —Si(CH₃)₃, —CH₂—CH═N—OCH₃, and —CH═CH—N(CH₃)—CH₃. Up to two heteroatoms may be consecutive, such as, for example, —CH₂—NH—OCH₃ and —CH₂—O—Si(CH₃)₃. Similarly, the term “heteroalkylene” by itself or as part of another substituent means a divalent radical derived from heteroalkyl, as exemplified, but not limited by, —CH₂—CH₂—S—CH₂—CH₂— and —CH₂—S—CH₂—CH₂—NH—CH₂—. For heteroalkylene groups, heteroatoms can also occupy either or both of the chain termini (e.g., alkyleneoxy, alkylenedioxy, alkyleneamino, alkylenediamino, and the like). Still further, for alkylene and heteroalkylene linking groups, no orientation of the linking group is implied by the direction in which the formula of the linking group is written.

Each of the above terms (e.g., “alkyl” and “heteroalkyl”) is meant to include both substituted and unsubstituted forms of the indicated radical. Preferred substituents for each type of radical are provided below.

Substituents for the alkyl and heteroalkyl radicals (including those groups often referred to as alkylene, alkenyl, heteroalkylene, heteroalkenyl, alkynyl, cycloalkyl, heterocycloalkyl, cycloalkenyl, and heterocycloalkenyl) can be one or more of a variety of groups selected from, but not limited to: —OR′, ═O, ═NR′, ═N—OR′, —NR′R″, —SR′, -halogen, —SiR′R″R′″, —OC(O)R′, —C(O)R′, —CO₂R′, —CONR′R″, —OC(O)NR′R″, —NR″C(O)R′, —NR′—C(O)NR″R′″, —NR″C(O)₂R′, —NR—C(NR′R″R′″)═NR″″, —NR—C(NR′R″)═NR′″, —S(O)R′, —S(O)₂R′, —S(O)₂NR′R″, —NRSO₂R′, —CN and —NO₂ in a number ranging from zero to (2m′+1), where m′ is the total number of carbon atoms in such radical. R′, R″, R′″ and R″″ each preferably independently refer to hydrogen, substituted or unsubstituted heteroalkyl, substituted or unsubstituted aryl, e.g., aryl substituted with 1-3 halogens, substituted or unsubstituted alkyl, alkoxy or thioalkoxy groups, or arylalkyl groups. When a compound of the invention includes more than one R group, for example, each of the R groups is independently selected as are each R′, R″, R′″ and R″″ groups when more than one of these groups is present. When R′ and R″ are attached to the same nitrogen atom, they can be combined with the nitrogen atom to form a 5-, 6-, or 7-membered ring. For example, —NR′R″ is meant to include, but not be limited to, 1-pyrrolidinyl and 4-morpholinyl. From the above discussion of substituents, one of skill in the art will understand that the term “alkyl” is meant to include groups including carbon atoms bound to groups other than hydrogen groups, such as haloalkyl (e.g., —CF₃ and —CH₂CF₃) and acyl (e.g., —C(O)CH₃, —C(O)CF₃, —C(O)CH₂OCH₃, and the like).

The term “aryl” means, unless otherwise stated, a polyunsaturated, aromatic, hydrocarbon substituent, which can be a single ring or multiple rings (preferably from 1 to 3 rings), which are fused together or linked covalently. The term “heteroaryl” refers to aryl groups (or rings) that contain from one to four heteroatoms selected from N, O, and S, wherein the nitrogen and sulfur atoms are optionally oxidized, and the nitrogen atom(s) are optionally quaternized. A heteroaryl group can be attached to the remainder of the molecule through a heteroatom. Non-limiting examples of aryl and heteroaryl groups include phenyl, 1-naphthyl, 2-naphthyl, 4-biphenyl, 1-pyrrolyl, 2-pyrrolyl, 3-pyrrolyl, 3-pyrazolyl, 2-imidazolyl, 4-imidazolyl, pyrazinyl, 2-oxazolyl, 4-oxazolyl, 2-phenyl-4-oxazolyl, 5-oxazolyl, 3-isoxazolyl, 4-isoxazolyl, 5-isoxazolyl, 2-thiazolyl, 4-thiazolyl, 5-thiazolyl, 2-furyl, 3-furyl, 2-thienyl, 3-thienyl, 2-pyridyl, 3-pyridyl, 4-pyridyl, 2-pyrimidyl, 4-pyrimidyl, 5-benzothiazolyl, purinyl, 2-benzimidazolyl, 5-indolyl, 1-isoquinolyl, 5-isoquinolyl, 2-quinoxalinyl, 5-quinoxalinyl, 3-quinolyl, and 6-quinolyl. Substituents for each of the above noted aryl and heteroaryl ring systems are selected from the group of acceptable substituents described below.

Similar to the substituents described for the alkyl radical, substituents for the aryl and heteroaryl groups are varied and are selected from, for example: halogen, —OR′, —NR′R″, —SR′, —SiR′R″R′″, —OC(O)R′, —C(O)R′, —CO₂R′, —CONR′R″, —OC(O)NR′R″, —NR″C(O)R′, —NR′—C(O)NR″R′″, —NR″C(O)₂R′, —NR—C(NR′R″R′″)═NR″″, —NR—C(NR′R″)═NR′″, —S(O)R′, —S(O)₂R′, —S(O)₂NR′R″, —NRSO₂R′, —CN and —NO₂, —R′, —N₃, —CH(Ph)₂, fluoro(C₁-C₄)alkoxy, and fluoro(C₁-C₄)alkyl, in a number ranging from zero to the total number of open valences on the aromatic ring system. When a compound of the invention includes more than one R group, for example, each of the R groups is independently selected as are each R′, R″, R′″ and R″″ groups when more than one of these groups is present.

Two of the substituents on adjacent atoms of the aryl or heteroaryl ring may optionally be replaced with a substituent of the formula —T—C(O)—(CRR′)_(q)—U—, wherein T and U are independently —NR—, —O—, —CRR′— or a single bond, and q is an integer of from 0 to 40. Alternatively, two of the substituents on adjacent atoms of the aryl or heteroaryl ring may optionally be replaced with a substituent of the formula —A—(CH₂)_(r)—B—, wherein A and B are independently —CRR′—, —O—, —NR—, —S—, —S(O)—, —S(O)₂—, —S(O)₂NR′— or a single bond, and r is an integer of from 1 to 40. One of the single bonds of the new ring so formed may optionally be replaced with a double bond. Alternatively, two of the substituents on adjacent atoms of the aryl or heteroaryl ring may optionally be replaced with a substituent of the formula —(CRR′)_(s)—X—(CR″R′″)_(d)—, where s and d are independently integers of from 0 to 40, and X is —O—, —NR′—, —S—, —S(O)—, —S(O)₂—, or —S(O)₂NR′—. The substituents R, R′, R″ and R′″ are preferably independently selected from hydrogen or substituted or unsubstituted (C₁-C₄₀)alkyl.

The term “detectable label” refers to a moiety that renders a molecule to which it is attached detectable by a variety of mechanisms including chemical, enzymatic, immunological, or radiological means. Some examples of detectable labels include fluorescent molecules (such as fluorescein, rhodamine, Texas Red, and phycoerythrin) and enzyme molecules (such as horseradish peroxidase, alkaline phosphatase, and β-galactosidase) that allow detection based on fluorescence emission or a product of a chemical reaction catalyzed by the enzyme. Radioactive labels involving various isotopes, such as ³H, ¹²⁵I, ³⁵S, ¹⁴C, or ³²P, can also be attached to appropriate molecules to enable detection by any suitable method that registers radioactivity, such as autoradiography. See, e.g., Tijssen, “Practice and Theory of Enzyme Immunoassays,” Laboratory Techniques in Biochemistry and Molecular Biology, Burdon and van Knippenberg Eds., Elsevier (1985), pp. 9-20. An introduction to labels, labeling procedures, and detection of labels can also be found in Polak and Van Noorden, Introduction to Immunocytochemistry, 2d Ed., Springer Verlag, NY (1997); and in Haugland, Handbook of Fluorescent Probes and Research Chemicals, a combined handbook and catalogue published by Molecular Probes, Inc. (1996).

The term “targeting moiety,” as used herein, refers to species that will selectively localize in a particular tissue or region of the body. The localization is mediated by specific recognition of molecular determinants, molecular size of the targeting agent or conjugate, ionic interactions, hydrophobic interactions and the like. Other mechanisms of targeting an agent to a particular tissue or region are known to those of skill in the art. Exemplary targeting moieties include antibodies, antibody fragments, transferrin, HS-glycoprotein, coagulation factors, serum proteins, β-glycoprotein, G-CSF, GM-CSF, M-CSF, EPO, saccharides, lectins, receptors, ligand for receptors, proteins such as BSA and the like. The targeting group can also be a small molecule, a term that is intended to include both non-peptides and peptides.

The symbol [symbol image not reproduced], whether utilized as a bond or displayed perpendicular to a bond, indicates the point at which the displayed moiety is attached to the remainder of the molecule, solid support, etc.

The term “increase,” as used herein, refers to a detectable positive change in quantity of a parameter when compared to a standard. The level of this positive change, for example, in the synthetic activity of a mutant endoglycoceramidase from its corresponding wild-type endoglycoceramidase, is preferably at least 10% or 20%, and more preferably at least 30%, 40%, 50%, 60% or 80%, and most preferably at least 100%.

The term “reduce” or “decrease” is defined as a detectable negative change in quantity of a parameter when compared to a standard. The level of this negative change, for example, in the hydrolytic activity of a mutant endoglycoceramidase from its corresponding wild-type endoglycoceramidase, is preferably at least 10% or 20%, and more preferably at least 30%, 40%, 50%, 60%, 80%, 90%, and most preferably at least 100%.
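The percentage changes referred to in the definitions of “increase” and “decrease” can be computed relative to the wild-type value taken as the standard. The sketch below is illustrative only; the function name and the activity values are hypothetical and are not measurements reported in this application.

```python
def percent_change(mutant_activity, wild_type_activity):
    """Signed percent change of a mutant's activity relative to its wild-type standard."""
    if wild_type_activity <= 0:
        raise ValueError("wild-type activity must be positive to serve as the standard")
    return 100.0 * (mutant_activity - wild_type_activity) / wild_type_activity


# Hypothetical activities in arbitrary units.
synthetic_change = percent_change(150.0, 100.0)   # positive value: an increase
hydrolytic_change = percent_change(20.0, 100.0)   # negative value: a decrease
print(synthetic_change, hydrolytic_change)
```

With these example numbers the synthetic activity shows a 50% increase and the hydrolytic activity an 80% decrease; a mutant with no detectable hydrolytic activity would correspond to a 100% decrease.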

BRIEF DESCRIPTION OF THE DRAWINGS

FIGS. 1A-1B set forth compounds that can be made using the enzyme of the invention.

FIG. 2 sets forth compounds that can be made using the enzyme of the invention.

FIGS. 3A-3B set forth compounds that can be made using the enzyme of the invention.

FIG. 4 sets forth compounds that can be made using the enzyme of the invention.

FIG. 5 sets forth compounds that can be made using the enzyme of the invention.

FIG. 6 sets forth compounds that can be made using the enzyme of the invention.

FIG. 7 sets forth compounds that can be made using the enzyme of the invention.

FIGS. 8A-8B set forth compounds that can be made using the enzyme of the invention.

FIG. 9 sets forth compounds that can be made using the enzyme of the invention.

FIGS. 10A-10B set forth compounds that can be made using the enzyme of the invention.

FIG. 11 sets forth compounds that can be made using the enzyme of the invention.

FIGS. 12A-12C set forth compounds that can be made using the enzyme of the invention.

FIGS. 13A-13C set forth compounds that can be made using the enzyme of the invention.

FIG. 14 is a schematic depiction of expression vector pT7-7, indicating restriction enzyme sites.

FIGS. 15A-15C illustrate an amino acid sequence alignment of wild-type endoglycoceramidases from Rhodococcus, Propionibacterium, Cyanea, and Hydra.

FIG. 16 illustrates SDS-PAGE analysis of EGCase purification. Lanes: 1) insoluble pellet fraction; 2) lysate soluble fraction; and 3) purified fraction.

FIG. 17 illustrates a reaction analysis by HPLC showing the synthesis of Lyso-GM3 after 12 hrs. Top panels: control runs. Bottom panels: reaction runs.

FIG. 18 illustrates a Michaelis-Menten curve for wild-type Rhodococcus EGC using 2,4-dinitrophenyl lactoside as the substrate.

FIGS. 19A-19C illustrate variation of kcat, Km, and kcat/Km with increasing detergent concentration for wild-type Rhodococcus EGC.

FIG. 20 illustrates pH rate profile for wild-type Rhodococcus EGC. Estimated pKa values for the catalytic glutamate residues are 3.2 and 6.5.

FIG. 21 illustrates expression in E. coli of Propionibacterium acnes wild-type EGC under a variety of conditions. In each series of three lanes, the first shows the pre-induction expression level, the second the total cell fraction after induction, and the third the soluble fraction of the cell lysate. In all cases, induction was performed at 18° C. Lanes 1-3: BL21 pLysS, 0.1 mM IPTG, M9 media. Lanes 4-6: Tuner, 0.1 mM IPTG, M9. Lanes 7-9: BL21 pLysS, 0.01 mM IPTG, Typ media. Lanes 10-12: Tuner, 0.1 mM IPTG, Typ media. Lane 14: Molecular weight standards.

DETAILED DESCRIPTION

Introduction

Glycolipids, each consisting of a saccharide moiety and a heteroalkyl moiety, e.g., Formula Ia, Formula Ib, Formula II or Formula III, are important constituents of cellular membranes. With their diverse sugar groups extruding outward from the membrane surface, glycolipids mediate cell growth and differentiation, recognize hormones and bacterial toxins, and determine antigenicity; some are recognized as tumor-associated antigens (Hakomori, Annu. Rev. Biochem., 50:733-764, 1981; Marcus, Mol. Immunol. 21:1083-1091, 1984). The present invention discloses novel enzymes and methods for producing glycolipids having a saccharyl moiety of virtually any structure, making it possible to study these important molecules and develop therapeutics, e.g., anti-tumor agents, targeting certain glycolipids.

Mutant Endoglycoceramidases

The present invention provides mutant endoglycoceramidases, also termed “endoglycoceramide synthases,” which have an increased synthetic activity for attaching a donor substrate comprising a saccharide moiety to an acceptor substrate (an aglycone) compared to the corresponding wild-type endoglycoceramidase. The mutant endoglycoceramidases can also have a reduced hydrolytic activity towards glycolipids compared to the corresponding wild-type endoglycoceramidase. Corresponding wild-type endoglycoceramidases have at least two identifiable conserved motifs, including an acid-base region (Val-X₁-(Ala/Gly)-(Tyr/Phe)-(Asp/Glu)-(Leu/Ile)-X₂-Asn-Glu-Pro-X₃-X₄-Gly or motif B or SEQ ID NO:51), and a nucleophilic region ((Ile/Met/Leu/Phe/Val)-(Leu/Met/Ile/Val)-(Gly/Ser/Thr)-Glu-(Phe/Thr/Met/Leu)-(Gly/Leu/Phe) or motif D or SEQ ID NO:53), and hydrolyze the glycoside linkage between a sugar chain and a lipid moiety in a glycolipid.

Structurally, the invention provides a mutant endoglycoceramidase having a modified nucleophilic carboxylate Glu/Asp residue, wherein the nucleophilic Glu/Asp resides within a (Ile/Met/Leu/Phe/Val)-(Leu/Met/Ile/Val)-(Gly/Ser/Thr)-(Glu/Asp)-(Phe/Thr/Met/Leu)-(Gly/Leu/Phe) sequence (SEQ ID NO:54) of a corresponding wild-type endoglycoceramidase, wherein the mutant endoglycoceramidase catalyzes the transfer of a saccharide moiety from a donor substrate to an acceptor substrate.
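As an illustration only, the nucleophilic-region motif quoted above (SEQ ID NO:54) can be written in one-letter code and used to scan a protein sequence for candidate nucleophilic Glu/Asp residues. The following is a minimal sketch, assuming a plain one-letter amino acid string as input; the function name and the toy sequence in the usage note are hypothetical and are not part of the disclosure.

```python
import re

# One-letter rendering of the motif quoted in the text (SEQ ID NO:54):
# (Ile/Met/Leu/Phe/Val)-(Leu/Met/Ile/Val)-(Gly/Ser/Thr)-(Glu/Asp)-
# (Phe/Thr/Met/Leu)-(Gly/Leu/Phe)
MOTIF = r"[IMLFV][LMIV][GST]([ED])[FTML][GLF]"

# A lookahead is used so that overlapping occurrences are all reported.
_SCAN = re.compile(r"(?=" + MOTIF + r")")


def find_nucleophile_candidates(protein):
    """Return (1-based position, residue) for each candidate Glu/Asp.

    Hypothetical helper for illustration: the candidate is the fourth
    residue of every motif occurrence in the given sequence.
    """
    return [(m.start() + 4, m.group(1)) for m in _SCAN.finditer(protein.upper())]
```

For a toy sequence such as "AAAILGEFGAAA", the helper reports the Glu at position 7. A motif match only nominates a candidate; the residue would still need to be confirmed, for example by alignment or labeling as described elsewhere in this specification.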

In a further aspect, the invention provides a mutant endoglycoceramidase having a modified Glu residue within the subsequence of Asn-Glu-Pro, wherein the subsequence resides within the acid-base sequence region of Val-X₁-(Ala/Gly)-(Tyr/Phe)-(Asp/Glu)-(Leu/Ile)-X₂-Asn-Glu-Pro-X₃-X₄-Gly sequence in the corresponding wild-type protein, wherein the mutant endoglycoceramidase catalyzes the transfer of a saccharide moiety from a donor substrate to an acceptor substrate.

In a related aspect, the invention provides a mutant endoglycoceramidase characterized in that

-   i) in its native form the endoglycoceramidase comprises an amino acid sequence that is any one of SEQ ID NOs: 2 (Rhodococcus), 4 (Rhodococcus), 6 (Propionibacterium acnes), 8 (Propionibacterium acnes), 10 (Cyanea nozakii), 12 (Cyanea nozakii), 14 (Hydra magnipapillata), 16 (Schistosoma japonicum), 17 (Dictyostelium discoideum), 18 (Streptomyces avermitilis), 19 (Leptospira interrogans), and 20 (Neurospora crassa); and
-   ii) the nucleophilic Glu/Asp residue within a (Ile/Met/Leu/Phe/Val)-(Leu/Met/Ile/Val)-(Gly/Ser/Thr)-Glu/Asp-(Phe/Thr/Met/Leu)-(Gly/Leu/Phe) sequence of a corresponding wild-type endoglycoceramidase is modified to an amino acid other than Glu/Asp.

In a further aspect, the invention provides a mutant endoglycoceramidase characterized in that

-   i) in its native form the endoglycoceramidase comprises an amino acid sequence that is any one of SEQ ID NOs: 2 (Rhodococcus), 4 (Rhodococcus), 6 (Propionibacterium acnes), 8 (Propionibacterium acnes), 10 (Cyanea nozakii), 12 (Cyanea nozakii), 14 (Hydra magnipapillata), 16 (Schistosoma japonicum), 17 (Dictyostelium discoideum), 18 (Streptomyces avermitilis), 19 (Leptospira interrogans), and 20 (Neurospora crassa); and
-   ii) the Glu residue within the subsequence of Asn-Glu-Pro of the acid-base sequence region Val-X₁-(Ala/Gly)-(Tyr/Phe)-(Asp/Glu)-(Leu/Ile)-X₂-Asn-Glu-Pro-X₃-X₄-Gly in the corresponding wild-type protein is modified to an amino acid other than Glu.

Typically, the mutant endoglycoceramidases of the present invention comprise a modified nucleophilic Glu/Asp residue and/or a modified acid-base sequence region Glu residue within the Asn-Glu-Pro subsequence of a corresponding wild-type endoglycoceramidase. One or both of the Glu residues are deleted or replaced with another chemical moiety that retains the integral structure of the protein such that the mutant enzyme has synthetic activity. For example, one or more of the nucleophilic and/or acid-base sequence region Glu residues (i.e., in the Asn-Glu-Pro subsequence region) can be replaced with an L-amino acid residue other than Glu, an unnatural amino acid, an amino acid analog, an amino acid mimetic, and the like. Usually, the one or more Glu residues are substituted with another L-amino acid other than Glu, for example, Gly, Ala, Ser, Asp, Asn, Gln, Cys, Thr, Ile, Leu or Val.

Functionally, the invention provides mutant endoglycoceramidases having a synthetic activity of coupling a glycosyl moiety and an aglycone substrate, forming a glycolipid. The mutant endoglycoceramidase can also have a reduced hydrolytic activity towards glycolipids compared to the corresponding wild-type endoglycoceramidase. The mutant endoglycoceramidases of the invention have a synthetic activity that is greater than the synthetic activity of the corresponding wild type endoglycoceramidase. Preferably, the synthetic activity is greater than its degradative (i.e., hydrolytic) activity in an assay. The assay for the synthetic activity of the mutant endoglycoceramidase comprises transferring a glycosyl moiety from a glycosyl donor substrate for said mutant to an aglycone (i.e., acceptor substrate). The synthetic activity can be readily measured in an assay designed to detect the rate of glycolipid synthesis by the mutant or the quantity of product synthesized by the enzyme.

In general, preferred mutant endoglycoceramidases of the invention are at least about 1.5-fold more synthetically active than their wild type analogues, more preferably, at least about 2-fold, at least about 5-fold, at least about 10-fold, at least about 20-fold, at least about 50-fold and more preferably still, at least about 100-fold. By more synthetically active is meant that the rate of starting material conversion by the enzyme is greater than that of the corresponding wild type enzyme and/or the amount of product produced within a selected time is greater than that produced by the corresponding wild type enzyme in a similar amount of time. A useful assay for determining enzyme synthetic activity includes transferring a glycosyl moiety from a glycosyl donor substrate for said mutant to an aglycone.

The corresponding wild-type endoglycoceramidase can be from a prokaryotic organism (e.g., a Rhodococcus, a Propionibacterium, a Streptomyces, or a Leptospira) or a eukaryotic organism (e.g., a Cyanea, a Hydra, a Schistosoma, a Dictyostelium, a Neurospora). For example, the corresponding wild-type or native endoglycoceramidase can be from an Actinobacteria, including a Rhodococcus, a Propionibacterium, or a Streptomyces. The corresponding wild-type or native endoglycoceramidase also can be from a Metazoan, including a Cyanea, a Hydra, or a Schistosoma, or from a Cnidaria, including a Cyanea or a Hydra. The corresponding wild-type or native endoglycoceramidase also can be from a Mycetozoa (e.g., a Dictyostelium), a Spirochete (e.g., a Leptospira), or a fungus, such as an Ascomycete (e.g., a Neurospora). In one embodiment, the corresponding wild-type endoglycoceramidase has an amino acid sequence of any one of SEQ ID NOs:2, 4, 6, 8, 10, 12, 14, 16, 17, 18, 19, or 20. In one embodiment, the corresponding wild-type endoglycoceramidase is encoded by a nucleic acid sequence of any one of SEQ ID NOs: 1, 3, 5, 7, 9, 11, 13, or 15.

The corresponding wild-type endoglycoceramidase can be from any known endoglycoceramidase sequence or any endoglycoceramidase sequence which has yet to be determined. Additional corresponding wild-type endoglycoceramidases can be identified using sequence databases and sequence alignment algorithms, for example, the publicly available GenBank database and the BLAST alignment algorithm, available on the worldwide web through ncbi.nlm.nih.gov. Additional corresponding wild-type endoglycoceramidases also can be found using routine techniques of hybridization and recombinant genetics. Basic texts disclosing the general methods of use in this invention include Sambrook and Russell, Molecular Cloning, A Laboratory Manual (3rd ed. 2001); Kriegler, Gene Transfer and Expression: A Laboratory Manual (1990); and Ausubel et al., eds., Current Protocols in Molecular Biology (1994). Native or wild-type endoglycoceramidases of interest include those encoded by nucleic acid sequences that hybridize under stringent hybridization conditions to one or more of SEQ ID NOs: 1, 3, 5, 7, 9, 11, 13, or 15. Native or wild-type endoglycoceramidases of interest also include those with one or more conservatively substituted amino acids or with at least about 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99% sequence identity to one or more of SEQ ID NOs:2, 4, 6, 8, 10, 12, 14, or 16-20.
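To make the sequence identity thresholds above concrete, the following minimal sketch computes percent identity from two sequences that have already been aligned (for example, by an alignment program). The function name and the convention used here (identical positions divided by aligned, non-gap positions) are assumptions chosen for illustration; alignment tools differ in how they define the denominator, and this specification does not prescribe one.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption for illustration: columns in which either sequence has a
    gap are ignored, so the denominator is the number of aligned residue
    pairs rather than the full alignment length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a != gap and b != gap
    ]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)
```

A candidate sequence could then be compared against a threshold from the text, e.g., `percent_identity(a, b) >= 80.0`.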

Wild-type and mutant endoglycoceramidases can be further characterized by a (Met/Val/Leu)-Leu-Asp-(Met-Phe-Ala)-His-Gln-Asp-(Met/Val/Leu)-X-(Ser/Asn) motif (motif A or SEQ ID NO:50) located N-terminal to the acid-base sequence region and a C-terminal Ala-Ile-Arg-(Gln/Ser/Thr)-Val-Asp motif (motif C or SEQ ID NO:52) located C-terminal to the acid-base sequence region. For example, the (Met/Val/Leu)-Leu-Asp-(Met-Phe-Ala)-His-Gln-Asp-(Met/Val/Leu) motif is located at residues 131-140 in Rhodococcus sp. M-777; at residues 129-138 in Rhodococcus sp. C9; at residues 136-145 in Propionibacterium acnes EGCa; at residues 153-162 in Propionibacterium acnes EGCb; at residues 130-139 in Cyanea nozakii; and at residues 121-130 in Hydra magnipapillata. The Ala-Ile-Arg-(Gln/Ser/Thr)-Val-Asp motif is located at residues 259-264 in Rhodococcus sp. M-777; at residues 250-255 in Rhodococcus sp. C9; at residues 262-267 in Propionibacterium acnes EGCa; at residues 280-285 in Propionibacterium acnes EGCb; at residues 272-277 in Cyanea nozakii; and at residues 263-268 in Hydra magnipapillata.

To enhance expression of a mutant endoglycoceramidase in the soluble fraction of a bacterial host cell, the mutant endoglycoceramidases typically lack the native N-terminal signal peptide sequence that is expressed in the corresponding wild-type enzyme. The signal peptide sequence is typically found within the N-terminal 15, 20, 25, 30, 35, 40, 45, 50 or 55 amino acid residues of a corresponding wild-type endoglycoceramidase. Predicted native N-terminal signal peptide sequences for wild-type endoglycoceramidases from Rhodococcus, Propionibacterium, Cyanea, Hydra, Schistosoma, Dictyostelium, Streptomyces, and Neurospora species are shown in SEQ ID NOs:59-68.

In addition to the amino acid sequences that comprise the mutant endoglycoceramidases, the present invention also includes nucleic acid sequences encoding a mutant endoglycoceramidase, expression vectors comprising such nucleic acid sequences, and host cells that comprise such expression vectors.

Cloning and Subcloning of a Wild-type Endoglycoceramidase Coding Sequence

A number of polynucleotide sequences encoding wild-type endoglycoceramidases, e.g., GenBank Accession No. U39554, have been determined and can be synthesized or obtained from a commercial supplier, such as Blue Heron Biotechnology (Bothell, Wash.).

The rapid progress in the studies of organism genomes has made possible a cloning approach where an organism DNA sequence database can be searched for any gene segment that has a certain percentage of sequence homology to a known nucleotide sequence, such as one encoding a previously identified endoglycoceramidase. Any DNA sequence so identified can be subsequently obtained by chemical synthesis and/or a polymerase chain reaction (PCR) technique such as the overlap extension method. For a short sequence, completely de novo synthesis may be sufficient; whereas further isolation of full length coding sequence from a human cDNA or genomic library using a synthetic probe may be necessary to obtain a larger gene.

Alternatively, a nucleic acid sequence encoding an endoglycoceramidase can be isolated from a cDNA or genomic DNA library using standard cloning techniques such as polymerase chain reaction (PCR), where homology-based primers can often be derived from a known nucleic acid sequence encoding an endoglycoceramidase. The most commonly used techniques for this purpose are described in standard texts, e.g., Sambrook and Russell, supra.

cDNA libraries suitable for obtaining a coding sequence for a wild-type endoglycoceramidase may be commercially available or can be constructed. The general methods of isolating mRNA, making cDNA by reverse transcription, ligating cDNA into a recombinant vector, transfecting into a recombinant host for propagation, screening, and cloning are well known (see, e.g., Gubler and Hoffman, Gene, 25: 263-269 (1983); Ausubel et al., supra). Upon obtaining an amplified segment of nucleotide sequence by PCR, the segment can be further used as a probe to isolate the full length polynucleotide sequence encoding the wild-type endoglycoceramidase from the cDNA library. A general description of appropriate procedures can be found in Sambrook and Russell, supra.

A similar procedure can be followed to obtain a full length sequence encoding a wild-type endoglycoceramidase from a genomic library. Genomic libraries are commercially available or can be constructed according to various art-recognized methods. In general, to construct a genomic library, the DNA is first extracted from an organism where an endoglycoceramidase is likely found. The DNA is then either mechanically sheared or enzymatically digested to yield fragments of about 12-20 kb in length. The fragments are subsequently separated by gradient centrifugation from polynucleotide fragments of undesired sizes and are inserted in bacteriophage λ vectors. These vectors and phages are packaged in vitro. Recombinant phages are analyzed by plaque hybridization as described in Benton and Davis, Science, 196: 180-182 (1977). Colony hybridization is carried out as described by Grunstein et al., Proc. Natl. Acad. Sci. USA, 72: 3961-3965 (1975).

Based on sequence homology, degenerate oligonucleotides can be designed as primer sets and PCR can be performed under suitable conditions (see, e.g., White et al., PCR Protocols: Current Methods and Applications, 1993; Griffin and Griffin, PCR Technology, CRC Press Inc. 1994) to amplify a segment of nucleotide sequence from a cDNA or genomic library. Using the amplified segment as a probe, the full length nucleic acid encoding a wild-type endoglycoceramidase is obtained. Oligonucleotides that are not commercially available can be chemically synthesized, e.g., according to the solid phase phosphoramidite triester method first described by Beaucage & Caruthers, Tetrahedron Lett. 22: 1859-1862 (1981), using an automated synthesizer, as described in Van Devanter et al., Nucleic Acids Res. 12: 6159-6168 (1984). Purification of oligonucleotides is performed using any art-recognized strategy, e.g., native acrylamide gel electrophoresis or anion-exchange HPLC as described in Pearson & Regnier, J. Chrom. 255: 137-149 (1983).

Upon acquiring a nucleic acid sequence encoding a wild-type endoglycoceramidase, the coding sequence can be subcloned into a vector, for instance, an expression vector, so that a recombinant endoglycoceramidase can be produced from the resulting construct. Further modifications to the wild-type endoglycoceramidase coding sequence, e.g., nucleotide substitutions, may be subsequently made to alter the characteristics of the enzyme.

Methods for Producing Mutant Endoglycoceramidases

In one aspect, the invention provides a method for generating a mutant endoglycoceramidase having an increased synthetic activity of coupling a saccharide and a substrate to form glycolipids compared to the corresponding wild-type endoglycoceramidase. The mutant endoglycoceramidase can also have a reduced hydrolytic activity towards glycolipids compared to the corresponding wild-type endoglycoceramidase. The method includes selectively conferring synthetic activity and/or disrupting the hydrolytic activity of the corresponding wild-type endoglycoceramidase. Synthetic activity can be conferred by modifying the nucleophilic carboxylate amino acid residue (i.e., a Glu or an Asp) of a corresponding wild-type endoglycoceramidase.

Accordingly, in one aspect, the invention provides a method for making a mutant endoglycoceramidase having enhanced synthetic activity in comparison to a corresponding wild-type endoglycoceramidase, the method comprising modifying the nucleophilic carboxylate amino acid residue in a corresponding wild-type endoglycoceramidase, wherein the nucleophilic carboxylate amino acid residue resides within a (Ile/Met/Leu/Phe/Val)-(Leu/Met/Ile/Val)-(Gly/Ser/Thr)-(Glu/Asp)-(Phe/Thr/Met/Leu)-(Gly/Leu/Phe) sequence (SEQ ID NO:54) of a corresponding wild-type endoglycoceramidase.

In carrying out the methods of producing a mutant endoglycoceramidase, one or both of the nucleophilic carboxylate amino acid residues (i.e., a Glu or an Asp) and/or acid-base sequence region Glu residues of a corresponding endoglycoceramidase can be deleted or replaced with another chemical moiety that retains the integral structure of the protein such that the mutant enzyme has synthetic activity. For example, one or more of the nucleophilic and/or hydrolytic Glu or Asp residues can be replaced with an L-amino acid residue other than Glu or Asp, a D-amino acid residue (including a D-Glu or a D-Asp), an unnatural amino acid, an amino acid analog, an amino acid mimetic, and the like. Usually, the one or more Glu or Asp residues are substituted with another L-amino acid other than Glu or Asp, for example, Gly, Ala, Ser, Asp, Asn, Glu, Gln, Cys, Thr, Ile, Leu or Val.

Introducing Mutations into the Endoglycoceramidase Coding Sequence

Modifications altering the enzymatic activity of an endoglycoceramidase may be made in various locations within the polynucleotide coding sequence. The preferred locations for such modifications are, however, within the nucleophilic site and the acid-base sequence region of the enzyme. Conserved regions likely to contain important residues for structure or native enzymatic activity can be identified by aligning amino acid sequences of wild-type endoglycoceramidases from different organisms. Such amino acid sequences are readily available on public databases, including GenBank. Alignment of endoglycoceramidase sequences with an endoglycoceramidase sequence where the nucleophilic residue has been identified allows for the identification of the nucleophilic residue in subsequent sequences. Alternatively, the nucleophilic residue can be identified (or confirmed) via a fluorosugar labeling strategy (see, U.S. Pat. No. 5,716,812).

From an encoding nucleic acid sequence, the amino acid sequence of a wild-type endoglycoceramidase, e.g., SEQ ID NOs:2, 4, 6, 8, 10, 12, 14, 16-20 can be deduced and the presence of a nucleophilic region or an acid-base region can be confirmed. Preferably, mutations are introduced into the nucleophilic region or the acid-base region. For instance, the Glu residue located in the middle of the three-amino acid segment Asn-Glu-Pro of the acid-base sequence region, can be targeted for mutation, such as deletion or substitution by another amino acid residue. In addition, the nucleophilic carboxylate (i.e., Glu or Asp) residue (bolded) in the (Ile/Met/Leu/Phe/Val)-(Leu/Met/Ile/Val)-(Gly/Ser/Thr)-Glu/Asp-(Phe/Thr/Met/Leu)-(Gly/Leu/Phe) motif of a corresponding wild-type endoglycoceramidase is also a target for introducing mutations to alter the enzymatic activity of an endoglycoceramidase. An artisan can accomplish the goal of mutating a target Glu residue by employing any one of the well known mutagenesis methods, which are discussed in detail below. Exemplary modifications are introduced to replace the Glu residue with another amino acid residue as depicted in SEQ ID NOs:29-33.

Modifications can be directed to the nucleic acid sequence encoding a wild-type or mutant endoglycoceramidase or to one or more amino acids of an endoglycoceramidase enzyme. Typically, modifications are directed to one or more nucleic acid codons encoding one or both of the nucleophilic site and the acid-base sequence region. For example, one or more nucleotides in the codon encoding the Glu residue in the acid-base sequence region are modified such that the codon encodes an amino acid other than Glu, for example, Gly, Ala, Ser, Asp, Asn, Gln, Cys, Thr, Ile, Leu or Val. In another example, one or more nucleotides in the codon encoding the Glu residue in the nucleophilic site are modified such that the codon encodes an amino acid other than Glu, for example, Gly, Ala, Ser, Asp, Asn, Gln, Cys, Thr, Ile, Leu or Val. Site-directed modifications to wild-type or mutant endoglycoceramidase nucleic acid sequences can be introduced using methods well-known in the art, including overlapping PCR or overlap extension PCR (see, for example, Aiyar, et al., Methods Mol Biol (1996) 57:177-91; and Pogulis, et al., Methods Mol Biol (1996) 57:167-76). Suitable PCR primers can be determined by one of skill in the art using the sequence information provided in GenBank or other sources. Services for large-scale site-directed mutagenesis of a desired sequence are commercially available, for example, from GeneArt of Toronto, Canada.
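The codon-level substitution described above can be sketched in silico as follows. This is an illustration under stated assumptions, not a protocol from this specification: the helper name, the toy coding sequence in the usage note, and the particular replacement codon chosen for each amino acid (any synonymous codon of the standard genetic code would serve) are all hypothetical.

```python
# Glu is encoded by GAA or GAG in the standard genetic code.
GLU_CODONS = {"GAA", "GAG"}

# One illustrative codon per replacement amino acid named in the text.
# The specific codon picked for each residue is an arbitrary assumption.
TARGET_CODONS = {
    "Gly": "GGC", "Ala": "GCG", "Ser": "AGC", "Asp": "GAT",
    "Asn": "AAC", "Gln": "CAG", "Cys": "TGC", "Thr": "ACC",
    "Ile": "ATT", "Leu": "CTG", "Val": "GTG",
}


def substitute_glu_codon(cds, residue_number, new_aa):
    """Replace the Glu codon at a 1-based residue number in a coding sequence.

    Raises ValueError if the codon at that position does not encode Glu,
    which guards against an off-by-one in the residue numbering.
    """
    start = 3 * (residue_number - 1)
    codon = cds[start:start + 3].upper()
    if codon not in GLU_CODONS:
        raise ValueError(f"codon {codon!r} at residue {residue_number} is not Glu")
    return cds[:start] + TARGET_CODONS[new_aa] + cds[start + 3:]
```

For a toy sequence encoding Met-Asn-Glu-Pro, `substitute_glu_codon("ATGAACGAACCG", 3, "Ser")` returns `"ATGAACAGCCCG"`, i.e., the Asn-Glu-Pro segment becomes Asn-Ser-Pro. In practice the edited sequence would guide primer design for a site-directed mutagenesis method such as overlap extension PCR.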

In addition, a variety of diversity-generating protocols are established and described in the art. See, e.g., Zhang et al., Proc. Natl. Acad. Sci. USA, 94: 4504-4509 (1997); and Stemmer, Nature, 370: 389-391 (1994). The procedures can be used separately or in combination to produce variants of a set of nucleic acids, and hence variants of encoded polypeptides. Kits for mutagenesis, library construction, and other diversity-generating methods are commercially available.

Mutational methods of generating diversity include, for example, site-directed mutagenesis (Botstein and Shortle, Science, 229: 1193-1201 (1985)), mutagenesis using uracil-containing templates (Kunkel, Proc. Natl. Acad. Sci. USA, 82: 488-492 (1985)), oligonucleotide-directed mutagenesis (Zoller and Smith, Nucl. Acids Res., 10: 6487-6500 (1982)), phosphorothioate-modified DNA mutagenesis (Taylor et al., Nucl. Acids Res., 13: 8749-8764 and 8765-8787 (1985)), and mutagenesis using gapped duplex DNA (Kramer et al., Nucl. Acids Res., 12: 9441-9456 (1984)).

Other possible methods for generating mutations include point mismatch repair (Kramer et al., Cell, 38: 879-887 (1984)), mutagenesis using repair-deficient host strains (Carter et al., Nucl. Acids Res., 13: 4431-4443 (1985)), deletion mutagenesis (Eghtedarzadeh and Henikoff, Nucl. Acids Res., 14: 5115 (1986)), restriction-selection and restriction-purification (Wells et al., Phil. Trans. R. Soc. Lond. A, 317: 415-423 (1986)), mutagenesis by total gene synthesis (Nambiar et al., Science, 223: 1299-1301 (1984)), double-strand break repair (Mandecki, Proc. Natl. Acad. Sci. USA, 83: 7177-7181 (1986)), mutagenesis by polynucleotide chain termination methods (U.S. Pat. No. 5,965,408), and error-prone PCR (Leung et al., Biotechniques, 1: 11-15 (1989)).

At the completion of modification, the mutant endoglycoceramidase coding sequences can then be subcloned into an appropriate vector for recombinant production in the same manner as the wild-type genes.

Modification of Nucleic Acids for Preferred Codon Usage in a Host Organism

The polynucleotide sequence encoding an endoglycoceramidase (either wild-type or mutant) can be altered to coincide with the preferred codon usage of a particular host. For example, the preferred codon usage of one strain of bacteria can be used to derive a polynucleotide that encodes a mutant endoglycoceramidase of the invention and includes the codons favored by this strain. The frequency of preferred codon usage exhibited by a host cell can be calculated by averaging the frequency of preferred codon usage in a large number of genes expressed by the host cell (e.g., a calculation service is available from the web site of the Kazusa DNA Research Institute, Japan). This analysis is preferably limited to genes that are highly expressed by the host cell. U.S. Pat. No. 5,824,864, for example, provides the frequency of codon usage by highly expressed genes exhibited by dicotyledonous plants and monocotyledonous plants. Services for the creation of nucleic acid sequences of preferred codon usage for optimized expression in cells of a particular desired organism (e.g., bacteria, yeast, insect, mammalian) can be commercially purchased, for example, from Blue Heron Biotechnology, Bothell, Wash.
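The codon usage calculation described above can be approximated with a short script. The sketch below is a simplification made for illustration: it pools codon counts across a supplied set of coding sequences rather than averaging per-gene frequencies, and it assumes the standard genetic code. The function names and the toy gene set in the usage note are hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# (third base varying fastest); '*' marks stop codons.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def codon_counts(genes):
    """Pool in-frame codon counts over an iterable of coding sequences."""
    counts = Counter()
    for cds in genes:
        cds = cds.upper()
        # Ignore any trailing partial codon and any non-ACGT triplets.
        for i in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[i:i + 3]
            if codon in CODON_TABLE:
                counts[codon] += 1
    return counts


def preferred_codons(genes):
    """Map each amino acid (one-letter code) to its most frequent codon."""
    counts = codon_counts(genes)
    best = {}
    for codon, n in counts.items():
        aa = CODON_TABLE[codon]
        if aa not in best or n > counts[best[aa]]:
            best[aa] = codon
    return best
```

With the toy input `["GAAGAAGAG", "GAACTG"]`, GAA is counted three times and GAG once, so GAA is reported as the preferred Glu codon. Consistent with the text, a meaningful table would be built from a large set of highly expressed genes of the intended host.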

The sequences of the cloned endoglycoceramidase genes, synthetic polynucleotides, and modified endoglycoceramidase genes can be verified using, e.g., the chain termination method for sequencing double-stranded templates as described in Wallace et al., Gene 16:21-26 (1981).

Expression of the Endoglycoceramidases

Following sequence verification, the wild-type or mutant endoglycoceramidase of the present invention can be produced using routine techniques in the field of recombinant genetics, relying on the polynucleotide sequences encoding the polypeptide disclosed herein.

Expression Systems

To obtain high level expression of a nucleic acid encoding a wild-type or a mutant endoglycoceramidase of the present invention, one typically subclones a polynucleotide encoding the endoglycoceramidase into an expression vector that contains a strong promoter to direct transcription, a transcription/translation terminator and a ribosome binding site for translational initiation. Suitable bacterial promoters are well known in the art and described, e.g., in Sambrook and Russell, supra, and Ausubel et al., supra. Bacterial expression systems for expressing the wild-type or mutant endoglycoceramidase are available in, e.g., E. coli, Bacillus sp., Salmonella, and Caulobacter. Kits for such expression systems are commercially available. Eukaryotic expression systems for mammalian cells, yeast, and insect cells are well known in the art and are also commercially available. For example, Pichia and Baculovirus expression systems can be purchased from Invitrogen (Carlsbad, Calif.). Pichia expression systems are also available for purchase from Research Corporation Technologies of Tucson, Ariz. Mammalian cells for heterologous polypeptide expression can be purchased from the American Type Culture Collection (ATCC) in Manassas, Va. and expression systems are commercially available, for example, from New England Biolabs, Beverly, Mass. In one embodiment, the eukaryotic expression vector is an adenoviral vector, an adeno-associated vector, or a retroviral vector.

The host cells are preferably microorganisms, such as, for example, yeast cells, bacterial cells, or filamentous fungal cells. Examples of suitable host cells include, for example, Azotobacter sp. (e.g., A. vinelandii), Pseudomonas sp., Rhizobium sp., Erwinia sp., Escherichia sp. (e.g., E. coli), Bacillus, Pseudomonas, Proteus, Salmonella, Serratia, Shigella, Rhizobia, Vitreoscilla, Paracoccus and Klebsiella sp., among many others. The cells can be of any of several genera, including Saccharomyces (e.g., S. cerevisiae), Candida (e.g., C. utilis, C. parapsilosis, C. krusei, C. versatilis, C. lipolytica, C. zeylanoides, C. guilliermondii, C. albicans, and C. humicola), Pichia (e.g., P. farinosa and P. ohmeri), Torulopsis (e.g., T. candida, T. sphaerica, T. xylinus, T. famata, and T. versatilis), Debaryomyces (e.g., D. subglobosus, D. cantarellii, D. globosus, D. hansenii, and D. japonicus), Zygosaccharomyces (e.g., Z. rouxii and Z. bailii), Kluyveromyces (e.g., K. marxianus), Hansenula (e.g., H. anomala and H. jadinii), and Brettanomyces (e.g., B. lambicus and B. anomalus). Examples of useful bacteria include, but are not limited to, Escherichia, Enterobacter, Azotobacter, Erwinia, Klebsiella, Bacillus, Pseudomonas, Proteus, and Salmonella. Suitable mammalian cells for expression include Chinese Hamster Ovary (CHO) cells, human embryonic kidney (HEK) 293 cells, and NIH 3T3 cells.

A construct that includes a polynucleotide of interest operably linked to gene expression control signals that, when placed in an appropriate host cell, drive expression of the polynucleotide is termed an “expression cassette.” A typical expression cassette generally contains a promoter operably linked to the nucleic acid sequence encoding the wild-type or mutant endoglycoceramidase and signals required for efficient polyadenylation of the transcript, ribosome binding sites, and translation termination. Accordingly, the invention provides expression cassettes into which the nucleic acids that encode fusion proteins are incorporated for high level expression in a desired host cell. The nucleic acid sequence encoding the endoglycoceramidase is typically linked to a cleavable signal peptide sequence to promote secretion of the endoglycoceramidase by the transformed cell. Such signal peptides include, among others, the signal peptides from tissue plasminogen activator, insulin, and neuron growth factor, and juvenile hormone esterase of Heliothis virescens. Additional elements of the cassette may include enhancers and, if genomic DNA is used as the structural gene, introns with functional splice donor and acceptor sites.

Typically, the polynucleotide that encodes the wild-type or mutant endoglycoceramidase polypeptides is placed under the control of a promoter that is functional in the desired host cell. An extremely wide variety of promoters are well known, and can be used in the expression vectors of the invention, depending on the particular application. Ordinarily, the promoter selected depends upon the cell in which the promoter is to be active. Other expression control sequences such as ribosome binding sites, transcription termination sites and the like are also optionally included.

Expression control sequences that are suitable for use in a particular host cell are often obtained by cloning a gene that is expressed in that cell. Commonly used prokaryotic control sequences, which are defined herein to include promoters for transcription initiation, optionally with an operator, along with ribosome binding site sequences, include such commonly used promoters as the beta-lactamase (penicillinase) and lactose (lac) promoter systems (Chang et al., Nature (1977) 198: 1056), the tryptophan (trp) promoter system (Goeddel et al., Nucleic Acids Res. (1980) 8: 4057), the tac promoter (DeBoer, et al., Proc. Natl. Acad. Sci. U.S.A. (1983) 80:21-25); and the lambda-derived P_(L) promoter and N-gene ribosome binding site (Shimatake et al., Nature (1981) 292: 128). The particular promoter system is not critical to the invention; any available promoter that functions in prokaryotes can be used.

For expression of endoglycoceramidase proteins in host cells other than E. coli, a promoter that functions in the particular prokaryotic species is required. Such promoters can be obtained from genes that have been cloned from the species, or heterologous promoters can be used. For example, the hybrid trp-lac promoter functions in Bacillus in addition to E. coli.

A ribosome binding site (RBS) is conveniently included in the expression cassettes of the invention. An RBS in E. coli, for example, consists of a nucleotide sequence 3-9 nucleotides in length located 3-11 nucleotides upstream of the initiation codon (Shine and Dalgarno, Nature (1975) 254: 34; Steitz, In Biological regulation and development: Gene expression (ed. R.F. Goldberger), vol. 1, p. 349, 1979, Plenum Publishing, NY).
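By way of illustration only, the length and spacing ranges given above for an E. coli RBS can be checked computationally when designing a cassette. The following Python sketch is a minimal example under stated assumptions: the 9-nucleotide string `SD_CONSENSUS`, the function name `find_rbs`, and the rule of requiring an exact substring match are hypothetical choices made for this example and are not taken from the cited references.

```python
# Assumption for illustration: a 9-nt Shine-Dalgarno-like consensus.
SD_CONSENSUS = "TAAGGAGGT"


def find_rbs(seq, start_index, min_len=3, max_len=9, min_gap=3, max_gap=11):
    """Search upstream of the initiation codon at start_index.

    Returns (rbs, gap) for the longest stretch of 3-9 nt that matches part
    of SD_CONSENSUS exactly and ends 3-11 nt before the initiation codon,
    or None when no such stretch exists.
    """
    seq = seq.upper()
    best = None
    for gap in range(min_gap, max_gap + 1):
        end = start_index - gap  # exclusive end of the candidate RBS
        # Try the longest candidate first for this spacing.
        for length in range(max_len, min_len - 1, -1):
            begin = end - length
            if begin < 0:
                continue
            candidate = seq[begin:end]
            if candidate in SD_CONSENSUS:
                if best is None or length > len(best[0]):
                    best = (candidate, gap)
                break
    return best


# Hypothetical cassette fragment: RBS, 7-nt spacer, then ATG.
fragment = "CCAGGAGGAACAGCTATGGCT"
hit = find_rbs(fragment, fragment.find("ATG"))
```

An exact-match rule is deliberately strict; a practitioner could instead score partial complementarity to the 16S rRNA, which this sketch does not attempt.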

For expression of the endoglycoceramidase proteins in yeast, convenient promoters include GAL1-10 (Johnson and Davies (1984) Mol. Cell. Biol. 4:1440-1448), ADH2 (Russell et al. (1983) J. Biol. Chem. 258:2674-2682), PHO5 (EMBO J. (1982) 6:675-680), and MFα (Herskowitz and Oshima (1982) in The Molecular Biology of the Yeast Saccharomyces (eds. Strathern, Jones, and Broach) Cold Spring Harbor Lab., Cold Spring Harbor, N.Y., pp. 181-209). Additional suitable promoters for use in yeast include the ADH2/GAPDH hybrid promoter as described in Cousens et al., Gene 61:265-275 (1987) and the AOX1 promoter for use in Pichia strains. For filamentous fungi such as, for example, strains of the fungi Aspergillus (McKnight et al., U.S. Pat. No. 4,935,349), examples of useful promoters include those derived from Aspergillus nidulans glycolytic genes, such as the ADH3 promoter (McKnight et al., EMBO J. 4: 2093-2099 (1985)) and the tpiA promoter. Yeast selectable markers include ADE2, HIS4, LEU2, TRP1, and ALG7, which confers resistance to tunicamycin; the neomycin phosphotransferase gene, which confers resistance to G418; and the CUP1 gene, which allows yeast to grow in the presence of copper ions. An example of a suitable terminator is the ADH3 terminator (McKnight et al.). Recombinant protein expression in yeast host cells is well known in the art. See, for example, Pichia Protocols, Higgins and Cregg, eds., 1998, Humana Press; Foreign Gene Expression in Fission Yeast: Schizosaccharomyces Pombe, Giga-Hama and Kumagai eds., 1997, Springer Verlag. Expression of heterologous proteins in Pichia strains of yeast (including Pichia pastoris, Pichia methanolica, and Pichia ciferrii) is also described in U.S. Pat. Nos. 6,638,735; 6,258,559; 6,194,196; 6,001,597; and 5,707,828, the disclosures of which are hereby incorporated herein by reference in their entirety for all purposes.

Either constitutive or regulated promoters can be used in the present invention. Regulated promoters can be advantageous because the host cells can be grown to high densities before expression of the endoglycoceramidase proteins is induced. High level expression of heterologous proteins slows cell growth in some situations. An inducible promoter is a promoter that directs expression of a gene where the level of expression is alterable by environmental or developmental factors such as, for example, temperature, pH, anaerobic or aerobic conditions, light, transcription factors and chemicals. Such promoters are referred to herein as “inducible” promoters, which allow one to control the timing of expression of the endoglycoceramidase proteins. For E. coli and other bacterial host cells, inducible promoters are known to those of skill in the art. These include, for example, the lac promoter, the bacteriophage lambda P_(L) promoter, the hybrid trp-lac promoter (Amann et al. (1983) Gene 25: 167; de Boer et al. (1983) Proc. Nat'l. Acad. Sci. USA 80: 21), and the bacteriophage T7 promoter (Studier et al. (1986) J. Mol. Biol.; Tabor et al. (1985) Proc. Nat'l. Acad. Sci. USA 82: 1074-8). These promoters and their use are discussed in Sambrook et al., supra. One preferred inducible promoter for expression in prokaryotes is a dual promoter that includes a tac promoter component linked to a promoter component obtained from a gene or genes that encode enzymes involved in galactose metabolism (e.g., a promoter from a UDPgalactose 4-epimerase gene (galE)). The dual tac-gal promoter is described in PCT Patent Application Publ. No. WO98/20111.

The particular expression vector used to transport the genetic information into the cell is not particularly critical. Any of the conventional vectors used for expression in eukaryotic or prokaryotic cells may be used. Standard bacterial expression vectors include plasmids such as pBR322 based plasmids, pSKF, pUC based plasmids, pET based plasmids (e.g., pET23D, pET28A, commercially available from Novagen/EMD Biosciences) and fusion expression systems such as GST and LacZ. Epitope tags can also be added to recombinant proteins to provide convenient methods of isolation, e.g., c-myc. In yeast, vectors include Yeast Integrating plasmids (e.g., YIp5) and Yeast Replicating plasmids (the YRp series plasmids) and pGPD-2.

Expression vectors containing regulatory elements from eukaryotic viruses are typically used in eukaryotic expression vectors, e.g., SV40 vectors, papilloma virus vectors, and vectors derived from Epstein-Barr virus. Other exemplary eukaryotic vectors include pMSG, pAV009/A⁺, pMTO10/A⁺, pMAMneo-5, baculovirus pDSVE, and any other vector allowing expression of proteins under the direction of the SV40 early promoter, SV40 late promoter, metallothionein promoter, murine mammary tumor virus promoter, Rous sarcoma virus promoter, polyhedrin promoter, or other promoters shown effective for expression in eukaryotic cells. Expression in mammalian cells can be achieved using a variety of commonly available plasmids, including pSV2, pBC12BI, and p91023, as well as lytic virus vectors (e.g., vaccinia virus, adenovirus, and baculovirus), episomal virus vectors (e.g., bovine papillomavirus), and retroviral vectors (e.g., murine retroviruses). Mammalian host cells suitable for expression of heterologous polypeptides include, for example, Chinese Hamster Ovary (CHO) cells, human embryonic kidney (HEK) 293 cells, and NIH 3T3 cells. Expression of heterologous polypeptides in mammalian expression systems is reviewed in Makrides, Gene Transfer and Expression in Mammalian Cells: New Comprehensive Biochemistry, 2003, Elsevier Science Ltd.

Some expression systems have markers that provide gene amplification such as thymidine kinase, hygromycin B phosphotransferase, and dihydrofolate reductase. Alternatively, high yield expression systems not involving gene amplification are also suitable, such as a baculovirus vector in insect cells, with a polynucleotide sequence encoding the mutant endoglycoceramidase under the direction of the polyhedrin promoter or other strong baculovirus promoters.

The elements that are typically included in expression vectors also include a replicon that functions in E. coli, a gene encoding antibiotic resistance to permit selection of bacteria that harbor recombinant plasmids, and unique restriction sites in nonessential regions of the plasmid to allow insertion of eukaryotic sequences. The particular antibiotic resistance gene chosen is not critical; any of the many resistance genes known in the art are suitable. The prokaryotic sequences are optionally chosen such that they do not interfere with the replication of the DNA in eukaryotic cells, if necessary.

Translational coupling may be used to enhance expression. The strategy uses a short upstream open reading frame derived from a highly expressed gene native to the translational system, which is placed downstream of the promoter, and a ribosome binding site followed after a few amino acid codons by a termination codon. Just prior to the termination codon is a second ribosome binding site, and following the termination codon is a start codon for the initiation of translation. The system dissolves secondary structure in the RNA, allowing for the efficient initiation of translation. See Squires, et al. (1988), J. Biol. Chem. 263: 16297-16302.

The endoglycoceramidase polypeptides can be expressed intracellularly, or can be secreted from the cell. Intracellular expression often results in high yields. If necessary, the amount of soluble, active fusion protein may be increased by performing refolding procedures (see, e.g., Sambrook et al., supra.; Marston et al., Bio/Technology (1984) 2: 800; Schoner et al., Bio/Technology (1985) 3: 151). In embodiments in which the endoglycoceramidase polypeptides are secreted from the cell, either into the periplasm or into the extracellular medium, the DNA sequence is linked to a cleavable signal peptide sequence. The signal sequence directs translocation of the fusion protein through the cell membrane. An example of a suitable vector for use in E. coli that contains a promoter-signal sequence unit is pTA1529, which has the E. coli phoA promoter and signal sequence (see, e.g., Sambrook et al., supra.; Oka et al., Proc. Natl. Acad. Sci. USA (1985) 82: 7212; Talmadge et al., Proc. Natl. Acad. Sci. USA (1980) 77: 3988; Takahara et al., J. Biol. Chem. (1985) 260: 2670). In another embodiment, the fusion proteins are fused to a subsequence of protein A or bovine serum albumin (BSA), for example, to facilitate purification, secretion, or stability.

The endoglycoceramidase polypeptides of the invention can also be further linked to other bacterial proteins. This approach often results in high yields, because normal prokaryotic control sequences direct transcription and translation. In E. coli, lacZ fusions are often used to express heterologous polypeptides. Suitable vectors are readily available, such as the pUR, pEX, and pMR100 series (see, e.g., Sambrook et al., supra.). For certain applications, it may be desirable to cleave the non-endoglycoceramidase amino acids from the fusion protein after purification. This can be accomplished by any of several methods known in the art, including cleavage by cyanogen bromide, a protease, or by Factor X_(a) (see, e.g., Sambrook et al., supra.; Itakura et al., Science (1977) 198: 1056; Goeddel et al., Proc. Natl. Acad. Sci. USA (1979) 76: 106; Nagai et al., Nature (1984) 309: 810; Sung et al., Proc. Natl. Acad. Sci. USA (1986) 83: 561). Cleavage sites can be engineered into the gene for the fusion protein at the desired point of cleavage. The present invention further encompasses vectors comprising fusion proteins comprising the mutant endoglycoceramidases.

More than one recombinant protein may be expressed in a single host cell by placing multiple transcriptional cassettes in a single expression vector, or by utilizing different selectable markers for each of the expression vectors which are employed in the cloning strategy.

A suitable system for obtaining recombinant proteins from E. coli which maintains the integrity of their N-termini has been described by Miller et al. Biotechnology 7:698-704 (1989). In this system, the gene of interest is produced as a C-terminal fusion to the first 76 residues of the yeast ubiquitin gene containing a peptidase cleavage site. Cleavage at the junction of the two moieties results in production of a protein having an intact authentic N-terminal residue.

As discussed above, a person skilled in the art will recognize that various conservative substitutions can be made to any wild-type or mutant endoglycoceramidase or its coding sequence while still retaining the synthetic activity of the endoglycoceramidase. Moreover, modifications of a polynucleotide coding sequence may also be made to accommodate preferred codon usage in a particular expression host without altering the resulting amino acid sequence.
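As a non-limiting illustration of the codon-usage modification just described, the Python sketch below replaces codons with synonymous codons and confirms that the encoded amino acid sequence is unchanged. The standard genetic code is used for translation. The `PREFERRED` table, the function names, and the example sequence are hypothetical and were chosen only for this example; they are not measured codon preferences for any particular host.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Hypothetical host preferences (illustrative values, not measured data).
PREFERRED = {"L": "CTG", "R": "CGT", "G": "GGC", "P": "CCG", "A": "GCG"}


def translate(dna):
    """Translate a coding sequence whose length is a multiple of three."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recode(dna, preferred=PREFERRED):
    """Swap each codon for the host-preferred synonymous codon, if one is listed."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    for aa, codon in preferred.items():
        # Guard against a preference table that would change the protein.
        if CODON_TABLE[codon] != aa:
            raise ValueError(f"{codon} does not encode {aa}")
    recoded = "".join(
        preferred.get(CODON_TABLE[dna[i:i + 3]], dna[i:i + 3])
        for i in range(0, len(dna), 3)
    )
    # The amino acid sequence must be identical before and after recoding.
    assert translate(recoded) == translate(dna)
    return recoded


# Hypothetical five-codon example (Met-Leu-Arg-Gly-stop).
optimized = recode("ATGTTAAGAGGATAA")
```

Amino acids absent from the preference table keep their original codons, so the sketch can be applied with a partial table.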

When recombinantly over-expressed in bacteria, wild-type and mutant endoglycoceramidases can form insoluble protein aggregates; significant amounts of the recombinant protein will reside in the insoluble fraction during subsequent purification procedures. Expression of recombinant endoglycoceramidases in insoluble inclusion bodies can be minimized by using one or more of several strategies known to those in the art, including, for example, expressing from an inducible promoter (e.g., lac, T7), adding low concentrations of inducer (e.g., IPTG), using bacterial expression strains that suppress uninduced protein expression (e.g., BL21 pLysS), using a bacterial expression strain with a heightened sensitivity to the concentration of inducer (e.g., Tuner™ host cells from Novagen/EMD Biosciences, San Diego, Calif.), using a bacterial expression strain that favors disulfide formation of expressed recombinant proteins (e.g., Origami™ host cells from Novagen), using minimal media (e.g., M9), varying induction temperatures (e.g., 16-37° C.), and adding a signal sequence to direct secretion into the periplasm (e.g., pelB).

Transfection Methods

Standard transfection methods are used to produce bacterial, mammalian, yeast or insect cell lines that express large quantities of the wild-type or mutant endoglycoceramidase, which are then purified using standard techniques (see, e.g., Colley et al., J. Biol. Chem. 264: 17619-17622 (1989); Guide to Protein Purification, in Methods in Enzymology, vol. 182 (Deutscher, ed., 1990)). Transformation of eukaryotic and prokaryotic cells is performed according to standard techniques (see, e.g., Morrison, J. Bact. 132: 349-351 (1977); Clark-Curtiss & Curtiss, Methods in Enzymology 101: 347-362 (Wu et al., eds, 1983)).

Any of the well known procedures for introducing foreign nucleotide sequences into host cells may be used. These include the use of calcium phosphate transfection, polybrene, protoplast fusion, electroporation, liposomes, microinjection, plasmid vectors, viral vectors and any of the other well known methods for introducing cloned genomic DNA, cDNA, synthetic DNA, or other foreign genetic material into a host cell (see, e.g., Sambrook and Russell, supra). It is only necessary that the particular genetic engineering procedure used be capable of successfully introducing at least one gene into the host cell capable of expressing the wild-type or mutant endoglycoceramidase.

Detection of the Expression of Recombinant Endoglycoceramidases

After the expression vector is introduced into appropriate host cells, the transfected cells are cultured under conditions favoring expression of the wild-type or mutant endoglycoceramidase. The cells are then screened for the expression of the recombinant polypeptide, which is subsequently recovered from the culture using standard techniques (see, e.g., Scopes, Protein Purification: Principles and Practice (1982); U.S. Pat. No. 4,673,641; Ausubel et al., supra; and Sambrook and Russell, supra).

Several general methods for screening gene expression are well known among those skilled in the art. First, gene expression can be detected at the nucleic acid level. A variety of methods of specific DNA and RNA measurement using nucleic acid hybridization techniques are commonly used (e.g., Sambrook and Russell, supra). Some methods involve an electrophoretic separation (e.g., Southern blot for detecting DNA and Northern blot for detecting RNA), but detection of DNA or RNA can be carried out without electrophoresis as well (such as by dot blot). The presence of nucleic acid encoding an endoglycoceramidase in transfected cells can also be detected by PCR or RT-PCR using sequence-specific primers.

Second, gene expression can be detected at the polypeptide level. Various immunological assays are routinely used by those skilled in the art to measure the level of a gene product, particularly using polyclonal or monoclonal antibodies that react specifically with a wild-type or mutant endoglycoceramidase of the present invention, such as a polypeptide having the amino acid sequence of SEQ ID NOs:29-33 (see, e.g., Harlow and Lane, Using Antibodies: A Laboratory Manual, Cold Spring Harbor, 1998; Harlow and Lane, Antibodies, A Laboratory Manual, Chapter 14, Cold Spring Harbor, 1988; Kohler and Milstein, Nature, 256: 495-497 (1975)). Such techniques require antibody preparation by selecting antibodies with high specificity against the recombinant polypeptide or an antigenic portion thereof. The methods of raising polyclonal and monoclonal antibodies are well established and their descriptions can be found in the literature, see, e.g., Harlow and Lane, supra; Kohler and Milstein, Eur. J. Immunol., 6: 511-519 (1976). More detailed descriptions of preparing antibody against the mutant endoglycoceramidase of the present invention and conducting immunological assays detecting the mutant endoglycoceramidase are provided in a later section.

In addition, functional assays may also be performed for the detection of a recombinant endoglycoceramidase in transfected cells. Assays for detecting hydrolytic or synthetic activity of the recombinant endoglycoceramidase are generally described in a later section.

Purification of Recombinant Endoglycoceramidases

Solubilization

Once the expression of a recombinant endoglycoceramidase in transfected host cells is confirmed, the host cells are then cultured in an appropriate scale for the purpose of purifying the recombinant enzyme.

When the endoglycoceramidases of the present invention are produced recombinantly by transformed bacteria in large amounts, typically after promoter induction, although expression can be constitutive, the proteins may form insoluble aggregates. There are several protocols that are suitable for purification of protein inclusion bodies. For example, purification of aggregate proteins (hereinafter referred to as inclusion bodies) typically involves the extraction, separation and/or purification of inclusion bodies by disruption of bacterial cells, e.g., by incubation in a buffer of about 100-150 μg/ml lysozyme and 0.1% Nonidet P40, a non-ionic detergent. The cell suspension can be ground using a Polytron grinder (Brinkman Instruments, Westbury, N.Y.). Alternatively, the cells can be sonicated on ice. Alternate methods of lysing bacteria are described in Ausubel et al. and Sambrook and Russell, both supra, and will be apparent to those of skill in the art.

The cell suspension is generally centrifuged and the pellet containing the inclusion bodies resuspended in buffer which does not dissolve but washes the inclusion bodies, e.g., 20 mM Tris-HCl (pH 7.2), 1 mM EDTA, 150 mM NaCl and 2% Triton-X 100, a non-ionic detergent. It may be necessary to repeat the wash step to remove as much cellular debris as possible. The remaining pellet of inclusion bodies may be resuspended in an appropriate buffer (e.g., 20 mM sodium phosphate, pH 6.8, 150 mM NaCl). Other appropriate buffers will be apparent to those of skill in the art.

Following the washing step, the inclusion bodies are solubilized by the addition of a solvent that is both a strong hydrogen acceptor and a strong hydrogen donor (or a combination of solvents each having one of these properties). The proteins that formed the inclusion bodies may then be renatured by dilution or dialysis with a compatible buffer. Suitable solubilization solvents include, but are not limited to, urea (from about 4 M to about 8 M), formamide (at least about 80%, volume/volume basis), guanidine hydrochloride (from about 4 M to about 8 M), and detergents including N-laurylsarcosine (sarkosyl), 3-(Cyclohexylamino)-1-propanesulfonic acid (CAPS), 3-[(3-Cholamidopropyl)dimethylammonio]-1-propanesulfonate (CHAPS), and lauryl maltoside. Some solvents that are capable of solubilizing aggregate-forming proteins, such as SDS (sodium dodecyl sulfate) and 70% formic acid, may be inappropriate for use in this procedure due to the possibility of irreversible denaturation of the proteins, accompanied by a lack of immunogenicity and/or activity. Although guanidine hydrochloride and similar agents are denaturants, this denaturation is not irreversible and renaturation may occur upon removal (by dialysis, for example) or dilution of the denaturant, allowing re-formation of the immunologically and/or biologically active protein of interest. After solubilization, the protein can be separated from other bacterial proteins by standard separation techniques.

Alternatively, it is possible to purify recombinant polypeptides, e.g., a mutant endoglycoceramidase, from bacterial periplasm. Where the recombinant protein is exported into the periplasm of the bacteria, the periplasmic fraction of the bacteria can be isolated by cold osmotic shock in addition to other methods known to those of skill in the art (see e.g., Ausubel et al., supra). To isolate recombinant proteins from the periplasm, the bacterial cells are centrifuged to form a pellet. The pellet is resuspended in a buffer containing 20% sucrose. To lyse the cells, the bacteria are centrifuged and the pellet is resuspended in ice-cold 5 mM MgSO₄ and kept in an ice bath for approximately 10 minutes. The cell suspension is centrifuged and the supernatant decanted and saved. The recombinant proteins present in the supernatant can be separated from the host proteins by standard separation techniques well known to those of skill in the art. Proteins exported into the periplasmic space may still form inclusion bodies.

Protein Refolding

Wild-type or mutant endoglycoceramidases purified from inclusion bodies generally must be refolded after solubilization. The presence of recombinantly expressed endoglycoceramidases in inclusion bodies can be minimized and subsequent proper refolding maximized by expressing the enzymes in a bacterial strain that favors formation of disulfide bonds (e.g., Origami™ host cells from Novagen/EMD Biosciences). Alternatively, unpaired cysteines or signal peptide sequences can be removed from the recombinant sequences, for instance, using truncation and site-directed mutagenesis techniques. The presence of recombinantly expressed enzyme in inclusion bodies also can be minimized by expressing the endoglycoceramidases as a fusion protein with a maltose binding domain (see, for example, Sachdev and Chirgwin, Protein Expr Purif. (1998) 1:122-32). Enzyme ultimately purified from inclusion bodies can be solubilized and then subjected to refolding buffers containing redox couples, for example reduced glutathione/oxidized glutathione (GSH/GSSG), or cysteine/cystamine, as described in PCT/US05/03856, which claims priority to U.S. Provisional Patent Application Nos. 60/542,210; 60/599,406; and 60/627,406, the disclosures of each of which are hereby incorporated herein by reference in their entirety for all purposes. Protein refolding kits are commercially available, for example, from Novagen/EMD Biosciences (see also, Frankel, et al., Proc. Natl. Acad. Sci. USA (1991) 88:1192-1196). Optimization of biochemical variables for proper refolding of a particular endoglycoceramidase, including protein concentration, addition of polar additives (e.g., arginine), pH, redox environment potential (the presence of redox couples), ionic strength, and species and concentration of detergent, chaotrope, divalent cations, osmolytes (e.g., polyethylene glycol (PEG)), and non-polar additives (e.g., sugars), can be evaluated using a fractional factorial screen, described in Armstrong, et al., Protein Science (1999) 8:1475-1483. Kits for carrying out fractional factorial protein refolding optimization screens are commercially available, for example, from Hampton Research, Laguna Niguel, Calif.
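For illustration, a two-level fractional factorial screen of refolding variables can be laid out as in the Python sketch below. It builds a half-fraction design in which the level of the last factor is set by the product of the coded levels of the other factors. This is one common construction and is offered only as an assumption-laden example; it is not asserted to be the design of Armstrong et al. or of any commercial kit. The factor names and the low and high levels are hypothetical.

```python
from itertools import product


def half_fraction(factors):
    """Build a 2^(k-1) two-level design from a dict of {name: (low, high)}.

    The first k-1 factors receive a full factorial in coded levels -1/+1;
    the last factor is aliased with the product of the others.
    """
    names = list(factors)
    if len(names) < 3:
        raise ValueError("at least three factors are needed")
    runs = []
    for signs in product((-1, 1), repeat=len(names) - 1):
        last = 1
        for s in signs:
            last *= s
        coded = signs + (last,)
        # Map coded levels back to the low (index 0) or high (index 1) setting.
        runs.append(
            {name: factors[name][0 if c < 0 else 1] for name, c in zip(names, coded)}
        )
    return runs


# Hypothetical refolding variables and levels, chosen only for illustration.
refolding_factors = {
    "pH": (6.5, 8.5),
    "arginine_M": (0.0, 0.5),
    "redox_couple": ("none", "GSH/GSSG"),
    "NaCl_mM": (50, 250),
}
screen = half_fraction(refolding_factors)
```

With four factors the sketch yields eight conditions instead of the sixteen of a full factorial, and each level of each factor appears in half of the runs.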

Purification of Protein

Purification Tags

The recombinant fusion protein of the invention can be constructed and expressed as a fusion protein with a molecular “purification tag” at one end, which facilitates purification of the protein. Such tags can also be used for immobilization of a protein of interest during the glycosylation reaction. Exemplified purification tags include MalE, 6 or more sequential histidine residues, cellulose binding protein, maltose binding protein (malE), glutathione S-transferase (GST), lactoferrin, and Sumo fusion protein cleavable sequences (commercially available from LifeSensors, Malvern, Pa. and EMD Biosciences). Vectors with purification tag sequences are commercially available from, for example, Novagen/EMD Biosciences. Suitable tags include “epitope tags,” which are protein sequences that are specifically recognized by an antibody. Epitope tags are generally incorporated into fusion proteins to enable the use of a readily available antibody to unambiguously detect or isolate the fusion protein. A “FLAG tag” is a commonly used epitope tag, specifically recognized by a monoclonal anti-FLAG antibody, consisting of the sequence AspTyrLysAspAspAspAspLys or a substantially identical variant thereof. Other epitope tags that can be used in the invention include, e.g., myc tag, AU1, AU5, DDDDK (EC5), E tag, E2 tag, Glu-Glu, a 6 residue histidine peptide, EYMPME, derived from the Polyoma middle T protein, HA, HSV, IRS, KT3, S tag, S1 tag, T7 tag, V5 tag, VSV-G, β-galactosidase, Gal4, green fluorescent protein (GFP), luciferase, protein C, protein A, cellulose binding protein, GST (glutathione S-transferase), a strep-tag, Nus-S, PPI-ases, Pfg 27, calmodulin binding protein, dsb A and fragments thereof, and granzyme B. Epitope peptides and antibodies that bind specifically to epitope sequences are commercially available from, e.g., Covance Research Products, Inc.; Bethyl Laboratories, Inc.; Abcam Ltd.; and Novus Biologicals, Inc.

Other haptens that are suitable for use as tags are known to those of skill in the art and are described, for example, in the Handbook of Fluorescent Probes and Research Chemicals (6th Ed., Molecular Probes, Inc., Eugene Oreg.). For example, dinitrophenol (DNP), digoxigenin, barbiturates (see, e.g., U.S. Pat. No. 5,414,085), and several types of fluorophores are useful as haptens, as are derivatives of these compounds. Kits are commercially available for linking haptens and other moieties to proteins and other molecules. For example, where the hapten includes a thiol, a heterobifunctional linker such as SMCC can be used to attach the tag to lysine residues present on the capture reagent.

Standard Protein Separation Techniques for Purification

When a recombinant polypeptide, e.g., the mutant endoglycoceramidase of the present invention, is expressed in host cells in a soluble form, its purification can follow the standard protein purification procedures known in the art, including ammonium sulfate precipitation, affinity columns, column chromatography, gel electrophoresis and the like (see, generally, R. Scopes, Protein Purification, Springer-Verlag, N.Y. (1982), Deutscher, Methods in Enzymology Vol. 182: Guide to Protein Purification, Academic Press, Inc. N.Y. (1990)). Substantially pure compositions of at least about 70, 75, 80, 85, 90% homogeneity are preferred, and 92, 95, 98 to 99% or more homogeneity are most preferred. The purified proteins may also be used, e.g., as immunogens for antibody production.

Solubility Fractionation

Often as an initial step, and if the protein mixture is complex, an initial salt fractionation can separate many of the unwanted host cell proteins (or proteins derived from the cell culture media) from the recombinant protein of interest, e.g., a mutant endoglycoceramidase of the present invention. The preferred salt is ammonium sulfate. Ammonium sulfate precipitates proteins by effectively reducing the amount of water in the protein mixture. Proteins then precipitate on the basis of their solubility. The more hydrophobic a protein is, the more likely it is to precipitate at lower ammonium sulfate concentrations. A typical protocol is to add saturated ammonium sulfate to a protein solution so that the resultant ammonium sulfate concentration is between 20-30%. This will precipitate the most hydrophobic proteins. The precipitate is discarded (unless the protein of interest is hydrophobic) and ammonium sulfate is added to the supernatant to a concentration known to precipitate the protein of interest. The precipitate is then solubilized in buffer and the excess salt removed if necessary, through either dialysis or diafiltration. Other methods that rely on solubility of proteins, such as cold ethanol precipitation, are well known to those of skill in the art and can be used to fractionate complex protein mixtures.
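As a worked illustration of the salt fractionation described above, the volume of saturated ammonium sulfate solution needed to bring a protein solution from one percent saturation to another follows from a simple mixing balance, V_add = V × (S2 − S1) / (1 − S2), where S1 and S2 are fractional saturations. The Python sketch below applies that balance. It assumes that volumes are additive and that the stock solution is fully saturated; the function name and the example volumes are hypothetical.

```python
def saturated_volume_to_add(volume_ml, start_fraction, target_fraction):
    """Volume (ml) of saturated ammonium sulfate to add to reach the target.

    Fractions are expressed from 0 to 1 (e.g., 0.25 for 25% saturation).
    Assumes additive volumes and a 100% saturated stock solution.
    """
    if not 0 <= start_fraction < target_fraction < 1:
        raise ValueError("require 0 <= start < target < 1")
    return volume_ml * (target_fraction - start_fraction) / (1 - target_fraction)


# Hypothetical two-step cut: 0% to 25%, then the supernatant from 25% to 50%.
first_cut = saturated_volume_to_add(100.0, 0.0, 0.25)
second_cut = saturated_volume_to_add(100.0 + first_cut, 0.25, 0.50)
```

The second call uses the enlarged volume after the first addition and ignores any small volume lost with the discarded precipitate.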

Size Differential Filtration

Based on its calculated molecular weight, a protein of interest can be separated from proteins of greater and lesser size using ultrafiltration through membranes of different pore sizes (for example, Amicon or Millipore membranes). As a first step, the protein mixture is ultrafiltered through a membrane with a pore size that has a lower molecular weight cut-off than the molecular weight of a protein of interest, e.g., a mutant endoglycoceramidase. The retentate of the ultrafiltration is then ultrafiltered against a membrane with a molecular weight cut-off greater than the molecular weight of the protein of interest. The recombinant protein will pass through the membrane into the filtrate. The filtrate can then be chromatographed as described below.
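The choice of the two membranes in the procedure above can be sketched as a simple selection from the cut-offs on hand. In the Python example below, the function name, the list of available cut-offs, the optional safety margin, and the 55 kDa protein mass are all hypothetical values used only to illustrate the logic; they are not properties of any endoglycoceramidase or of any supplier's catalog.

```python
def pick_membranes(protein_kda, available_mwco_kda, margin=1.0):
    """Return (first_mwco, second_mwco) in kDa for two-step ultrafiltration.

    first_mwco  : largest cut-off below the protein mass (protein is retained)
    second_mwco : smallest cut-off above the protein mass (protein passes)
    margin      : optional factor (>= 1) separating each cut-off from the mass
    """
    if margin < 1:
        raise ValueError("margin must be at least 1")
    lower = [m for m in available_mwco_kda if m * margin < protein_kda]
    upper = [m for m in available_mwco_kda if m > protein_kda * margin]
    if not lower or not upper:
        raise ValueError("no suitable pair of membranes is available")
    return max(lower), min(upper)


# Hypothetical inventory of nominal cut-offs and a hypothetical 55 kDa protein.
inventory = [3, 10, 30, 50, 100, 300]
tight_pair = pick_membranes(55, inventory)
safer_pair = pick_membranes(55, inventory, margin=2.0)
```

Because nominal cut-offs are not sharp, a margin greater than one may be preferred in practice at the cost of a wider size window.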

Column Chromatography

The proteins of interest (such as the mutant endoglycoceramidase of the present invention) can also be separated from other proteins on the basis of their size, net surface charge, hydrophobicity, or affinity for ligands. In addition, antibodies raised against endoglycoceramidase can be conjugated to column matrices and the endoglycoceramidase immunopurified. When the enzymes are expressed as fusion proteins with purification tags, a column loaded with resin that specifically binds to the purification tag is used, for example, resin conjugated to nickel, cellulose, maltose, anti-lactoferrin antibodies, or glutathione. All of these methods are well known in the art.

It will be apparent to one of skill that chromatographic techniques can be performed at any scale and using equipment from many different manufacturers (e.g., Pharmacia Biotech).

Production of Antibodies against Endoglycoceramidases and Immunoassays for Detection of Endoglycoceramidase Expression

To confirm the production of a recombinant endoglycoceramidase, immunological assays may be useful to detect in a sample the expression of the endoglycoceramidase. Immunological assays are also useful for quantifying the expression level of the recombinant enzyme.

Production of Antibodies Against Endoglycoceramidase

Methods for producing polyclonal and monoclonal antibodies that react specifically with an immunogen of interest are known to those of skill in the art (see, e.g., Coligan, Current Protocols in Immunology Wiley/Greene, NY, 1991; Harlow and Lane, Antibodies: A Laboratory Manual Cold Spring Harbor Press, NY, 1989; Stites et al. (eds.) Basic and Clinical Immunology (4th ed.) Lange Medical Publications, Los Altos, Calif., and references cited therein; Goding, Monoclonal Antibodies: Principles and Practice (2d ed.) Academic Press, New York, N.Y., 1986; and Kohler and Milstein Nature 256: 495-497, 1975). Such techniques include antibody preparation by selection of antibodies from libraries of recombinant antibodies in phage or similar vectors (see, Huse et al., Science 246: 1275-1281, 1989; and Ward et al., Nature 341: 544-546, 1989).

In order to produce antisera containing antibodies with desired specificity, the polypeptide of interest (e.g., a mutant endoglycoceramidase of the present invention) or an antigenic fragment thereof can be used to immunize suitable animals, e.g., mice, rabbits, or primates. A standard adjuvant, such as Freund's adjuvant, can be used in accordance with a standard immunization protocol. Alternatively, a synthetic antigenic peptide derived from that particular polypeptide can be conjugated to a carrier protein and subsequently used as an immunogen.

The animal's immune response to the immunogen preparation is monitored by taking test bleeds and determining the titer of reactivity to the antigen of interest. When appropriately high titers of antibody to the antigen are obtained, blood is collected from the animal and antisera are prepared. Further fractionation of the antisera to enrich antibodies specifically reactive to the antigen and purification of the antibodies can be performed subsequently, see, Harlow and Lane, supra, and the general descriptions of protein purification provided above.

Monoclonal antibodies are obtained using various techniques familiar to those of skill in the art. Typically, spleen cells from an animal immunized with a desired antigen are immortalized, commonly by fusion with a myeloma cell (see, Kohler and Milstein, Eur. J. Immunol. 6:511-519, 1976). Alternative methods of immortalization include, e.g., transformation with Epstein Barr Virus, oncogenes, or retroviruses, or other methods well known in the art. Colonies arising from single immortalized cells are screened for production of antibodies of the desired specificity and affinity for the antigen, and the yield of the monoclonal antibodies produced by such cells may be enhanced by various techniques, including injection into the peritoneal cavity of a vertebrate host.

Additionally, monoclonal antibodies may be recombinantly produced upon identification of nucleic acid sequences encoding an antibody with desired specificity or a binding fragment of such antibody by screening a human B cell cDNA library according to the general protocol outlined by Huse et al., supra. The general principles and methods of recombinant polypeptide production discussed above are applicable for antibody production by recombinant methods.

When necessary, antibodies capable of specifically recognizing a mutant endoglycoceramidase of the present invention can be tested for their cross-reactivity against the corresponding wild-type endoglycoceramidase and thus distinguished from the antibodies against the wild-type enzyme. For instance, antisera obtained from an animal immunized with a mutant endoglycoceramidase can be run through a column on which a corresponding wild-type endoglycoceramidase is immobilized. The portion of the antisera that passes through the column recognizes only the mutant endoglycoceramidase and not the corresponding wild-type endoglycoceramidase. Similarly, monoclonal antibodies against a mutant endoglycoceramidase can also be screened for their exclusivity in recognizing only the mutant but not the wild-type endoglycoceramidase.

Polyclonal or monoclonal antibodies that specifically recognize only the mutant endoglycoceramidase of the present invention but not the corresponding wild-type endoglycoceramidase are useful for isolating the mutant enzyme from the wild-type endoglycoceramidase, for example, by incubating a sample with a mutant endoglycoceramidase-specific polyclonal or monoclonal antibody immobilized on a solid support.

Immunoassays for Detecting Endoglycoceramidase Expression

Once antibodies specific for an endoglycoceramidase of the present invention are available, the amount of the polypeptide in a sample, e.g., a cell lysate, can be measured by a variety of immunoassay methods providing qualitative and quantitative results to a skilled artisan. For a review of immunological and immunoassay procedures in general see, e.g., Stites, supra; U.S. Pat. Nos. 4,366,241; 4,376,110; 4,517,288; and 4,837,168.

Labeling in Immunoassays

Immunoassays often utilize a labeling agent to specifically bind to and label the binding complex formed by the antibody and the target protein. The labeling agent may itself be one of the moieties comprising the antibody/target protein complex, or may be a third moiety, such as another antibody, that specifically binds to the antibody/target protein complex. A label may be detectable by spectroscopic, photochemical, biochemical, immunochemical, electrical, optical or chemical means. Examples include, but are not limited to, magnetic beads (e.g., Dynabeads™), fluorescent dyes (e.g., fluorescein isothiocyanate, Texas red, rhodamine, and the like), radiolabels (e.g., ³H, ¹²⁵I, ³⁵S, ¹⁴C, or ³²P), enzymes (e.g., horseradish peroxidase, alkaline phosphatase, and others commonly used in an ELISA), and colorimetric labels such as colloidal gold or colored glass or plastic (e.g., polystyrene, polypropylene, latex, etc.) beads.

In some cases, the labeling agent is a second antibody bearing a detectable label. Alternatively, the second antibody may lack a label, but it may, in turn, be bound by a labeled third antibody specific to antibodies of the species from which the second antibody is derived. The second antibody can be modified with a detectable moiety, such as biotin, to which a third labeled molecule can specifically bind, such as enzyme-labeled streptavidin.

Other proteins capable of specifically binding immunoglobulin constant regions, such as protein A or protein G, can also be used as the label agents. These proteins are normal constituents of the cell walls of streptococcal bacteria. They exhibit a strong non-immunogenic reactivity with immunoglobulin constant regions from a variety of species (see, generally, Kronvall, et al. J. Immunol., 111: 1401-1406 (1973); and Akerstrom, et al., J. Immunol., 135: 2589-2592 (1985)).

Immunoassay Formats

Immunoassays for detecting a target protein of interest (e.g., a recombinant endoglycoceramidase) from samples may be either competitive or noncompetitive. Noncompetitive immunoassays are assays in which the amount of captured target protein is directly measured. In one preferred “sandwich” assay, for example, the antibody specific for the target protein can be bound directly to a solid substrate where the antibody is immobilized. It then captures the target protein in test samples. The antibody/target protein complex thus immobilized is then bound by a labeling agent, such as a second or third antibody bearing a label, as described above.

In competitive assays, the amount of target protein in a sample is measured indirectly by measuring the amount of an added (exogenous) target protein displaced (or competed away) from an antibody specific for the target protein by the target protein present in the sample. In a typical example of such an assay, the antibody is immobilized and the exogenous target protein is labeled. Since the amount of the exogenous target protein bound to the antibody is inversely proportional to the concentration of the target protein present in the sample, the target protein level in the sample can thus be determined based on the amount of exogenous target protein bound to the antibody and thus immobilized.

In some cases, western blot (immunoblot) analysis is used to detect and quantify the presence of a wild-type or mutant endoglycoceramidase in the samples. The technique generally comprises separating sample proteins by gel electrophoresis on the basis of molecular weight, transferring the separated proteins to a suitable solid support (such as a nitrocellulose filter, a nylon filter, or a derivatized nylon filter) and incubating the samples with the antibodies that specifically bind the target protein. These antibodies may be directly labeled or alternatively may be subsequently detected using labeled antibodies (e.g., labeled sheep anti-mouse antibodies) that specifically bind to the antibodies against the endoglycoceramidase.

Other assay formats include liposome immunoassays (LIA), which use liposomes designed to bind specific molecules (e.g., antibodies) and release encapsulated reagents or markers. The released chemicals are then detected according to standard techniques (see, Monroe et al., Amer. Clin. Prod. Rev., 5: 34-41 (1986)).

Methods for Synthesizing a Glycolipid Using Mutant Endoglycoceramidases

The invention also provides a method of synthesizing a glycolipid or aglycone. The method includes contacting a glycosyl donor comprising a glycosyl group, and an aglycone with a mutant endoglycoceramidase of the invention under conditions appropriate to transfer said glycosyl group to said aglycone.

In one aspect, the invention provides a method of synthesizing a glycolipid or aglycone, the method comprising contacting a donor substrate comprising a saccharide moiety and an acceptor substrate with a mutant endoglycoceramidase having a modified nucleophilic carboxylate (i.e., Glu or Asp) residue, wherein the nucleophilic Glu/Asp resides within a (Ile/Met/Leu/Phe/Val)-(Leu/Met/Ile/Val)-(Gly/Ser/Thr)-(Glu/Asp)-(Phe/Thr/Met/Leu)-(Gly/Leu/Phe) sequence of a corresponding wild-type endoglycoceramidase, under conditions wherein the endoglycoceramidase catalyzes the transfer of a saccharide moiety from a donor substrate to an acceptor substrate, thereby producing the glycolipid or aglycone.

In a further aspect, the invention provides a method of synthesizing a glycolipid or aglycone, the method comprising contacting a donor substrate comprising a saccharide moiety and an acceptor substrate with a mutant endoglycoceramidase having a modified Glu residue within the subsequence of Asn-Glu-Pro, wherein the subsequence resides within the acid-base sequence region of Val-X₁-(Ala/Gly)-(Tyr/Phe)-(Asp/Glu)-(Leu/Ile)-X₂-Asn-Glu-Pro-X₃-X₄-Gly sequence in the corresponding wild-type protein, under conditions wherein the endoglycoceramidase catalyzes the transfer of a saccharide moiety from a donor substrate to an acceptor substrate, thereby producing the glycolipid or aglycone.
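The two sequence regions recited above can be written as regular expressions and used to locate candidate catalytic residues in a protein sequence. The following is a minimal sketch only: the names NUCLEOPHILE_MOTIF, ACID_BASE_MOTIF and find_catalytic_residues, and the short toy sequence, are hypothetical and are not taken from any sequence of the Sequence Listing. X₁ through X₄ are assumed to stand for any single amino acid.

```python
import re

# Nucleophile region:
# (I/M/L/F/V)-(L/M/I/V)-(G/S/T)-(E/D)-(F/T/M/L)-(G/L/F)
# The capture group marks the nucleophilic Glu/Asp.
NUCLEOPHILE_MOTIF = re.compile(r"[IMLFV][LMIV][GST]([ED])[FTML][GLF]")

# Acid-base region:
# V-X1-(A/G)-(Y/F)-(D/E)-(L/I)-X2-N-E-P-X3-X4-G
# The capture group marks the Glu of the Asn-Glu-Pro subsequence.
ACID_BASE_MOTIF = re.compile(r"V.[AG][YF][DE][LI].N(E)P..G")


def find_catalytic_residues(seq):
    """Return 1-based (position, residue) pairs for each motif found in seq."""
    hits = {"nucleophile": [], "acid_base": []}
    for m in NUCLEOPHILE_MOTIF.finditer(seq):
        hits["nucleophile"].append((m.start(1) + 1, m.group(1)))
    for m in ACID_BASE_MOTIF.finditer(seq):
        hits["acid_base"].append((m.start(1) + 1, m.group(1)))
    return hits


# Hypothetical toy sequence built only to exercise both patterns.
TOY_SEQUENCE = "MKT" + "VAAYELANEPRGG" + "SSS" + "ILGEFG" + "KK"

if __name__ == "__main__":
    print(find_catalytic_residues(TOY_SEQUENCE))
```

A match only flags a candidate residue; whether that residue is the catalytic one in a given enzyme would still need to be confirmed, for example by alignment against a characterized endoglycoceramidase.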

In carrying out the methods of glycolipid synthesis, one or both of the nucleophilic carboxylate amino acid residue (i.e., a Glu or an Asp) and/or acid-base sequence region Glu residues of a corresponding wild-type endoglycoceramidase can be deleted or replaced with another chemical moiety that retains the integral structure of the protein such that the mutant enzyme has synthetic activity. For example, one or more of the nucleophilic carboxylate amino acid residues (Glu or Asp) and/or acid-base sequence region Glu residues can be replaced with an L-amino acid residue other than Glu or Asp, a D-amino acid residue (including a D-Glu or a D-Asp), an unnatural amino acid, an amino acid analog, an amino acid mimetic, and the like. Usually, the one or more carboxylate amino acid residues (Glu or Asp) are substituted with another L-amino acid other than Glu or Asp, for example, Gly, Ala, Ser, Asp, Asn, Glu, Gln, Cys, Thr, Ile, Leu or Val.

In one embodiment, the mutant enzymes of the invention convert at least about 50% of the starting materials, based upon the limiting reagent, to a desired glycolipid, more preferably, at least about 60%, 70%, 80% or 90%. In another preferred embodiment, the conversion of the limiting reagent to glycolipid is virtually quantitative, affording a conversion that is at least about 90%, and more preferably, at least about 92%, 94%, 96%, 98% and even more preferably, at least about 99%.

In another exemplary embodiment, the glycosyl donor and the acceptor substrate (i.e., aglycone) are present in an approximately 1:1 molar ratio and the enzyme of the invention, acting catalytically, converts the two reagents to a glycolipid in at least about 50% yield, more preferably at least about 60%, 70%, or 80%. In a further exemplary embodiment, the conversion is essentially quantitative as discussed above.
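The conversion figures above are stated relative to the limiting reagent. A minimal sketch of that calculation follows, assuming one mole of donor and one mole of acceptor give one mole of glycolipid; the function name percent_conversion and the example amounts are hypothetical.

```python
def percent_conversion(product_nmol, donor_nmol, acceptor_nmol):
    """Percent conversion to glycolipid, based on the limiting reagent.

    Assumes 1:1:1 stoichiometry (donor : acceptor : glycolipid), so the
    limiting reagent is simply whichever substrate is present in the
    smaller molar amount.
    """
    limiting_nmol = min(donor_nmol, acceptor_nmol)
    if limiting_nmol <= 0:
        raise ValueError("both substrates must be present in positive amounts")
    return 100.0 * product_nmol / limiting_nmol


if __name__ == "__main__":
    # Hypothetical amounts: 18 nmol donor, 15 nmol acceptor, 13.5 nmol product.
    print(percent_conversion(13.5, donor_nmol=18.0, acceptor_nmol=15.0))
```

With the donor in slight excess, the acceptor is the limiting reagent and the conversion is reported against it.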

In one embodiment, the synthesized glycolipid is an aglycone (non-carbohydrate alcohol (OH) or (SH)) conjugated to a non-reducing sugar and a non-glycoside.

Donor Substrates

Donor substrates for wild-type and mutant endoglycoceramidases include any activated glycosyl derivatives of anomeric configuration opposite the natural glycosidic linkage. The enzymes of the invention are used to couple α-modified or β-modified glycosyl donors, usually α-modified glycosyl donors, with glycoside acceptors. Preferred donor molecules are glycosyl fluorides, although donors with other groups which are reasonably small and which function as relatively good leaving groups can also be used. Examples of other glycosyl donor molecules include glycosyl chlorides, bromides, acetates, mesylates, propionates, pivaloates, and glycosyl molecules modified with substituted phenols. Among the α-modified or β-modified glycosyl donors, α-galactosyl, α-mannosyl, α-glucosyl, α-fucosyl, α-xylosyl, α-sialyl, α-N-acetylglucosaminyl, α-N-acetylgalactosaminyl, β-galactosyl, β-mannosyl, β-glucosyl, β-fucosyl, β-xylosyl, β-sialyl, β-N-acetylglucosaminyl and β-N-acetylgalactosaminyl are most preferred. Additional donor substrates include ganglioside head groups, for example, those listed in Table 2, below, and those depicted in FIGS. 1-13. Accordingly, in one embodiment, the donor substrate can be one or more ganglioside glycosyl head groups selected from the group consisting of GD_(1a), GD_(1a), GD_(1b), GD₂, GD₃, Gg3, Gg4, GH₁, GH₂, GH₃, GM₁, GM_(1b), GM₂, GM₃, Fuc-GM₁, GP₁, GP₂, GP₃, GQ_(1b), GQ_(1B), GQ_(1β), GQ_(1β), GQ₂, GQ₃, GT_(1a), GT_(1b), GT_(1c), GT_(1β), GT_(1c), GT₂, and GT₃. The donor molecules can be monosaccharides, or may themselves contain multiple sugar moieties (oligosaccharides). Donor substrates of use in the particular methods include those described in U.S. Pat. Nos. 6,284,494; 6,204,029; 5,952,203; and 5,716,812.

Glycosyl fluorides can be prepared from the free sugar by first acetylating the sugar and then treating it with HF/pyridine. This will generate the thermodynamically most stable anomer of the protected (acetylated) glycosyl fluoride. If the less stable anomer is desired, it may be prepared by converting the peracetylated sugar with HBr/HOAc or with HCl to generate the anomeric bromide or chloride. This intermediate is reacted with a fluoride salt such as silver fluoride to generate the glycosyl fluoride. Acetylated glycosyl fluorides may be deprotected by reaction with mild (catalytic) base in methanol (e.g., NaOMe/MeOH). In addition, glycosyl donor molecules, including many glycosyl fluorides, can be purchased commercially. Thus a wide range of donor molecules is available for use in the methods of the present invention.

Acceptor Substrates

Suitable acceptor substrates include any aglycone that the mutant endoglycoceramidases can conjugate with a saccharide moiety. For example, the mutant endoglycoceramide synthases are capable of synthesizing a glycolipid or aglycone by coupling a saccharide and a heteroalkyl substrate with a structure as shown in Formula Ia, Formula Ib, Formula II or Formula III as shown below:

In Formula Ia and Formula Ib, the symbol Z represents OH, SH, or NR⁴R^(4′). R¹ and R² are members independently selected from NHR⁴, SR⁴, OR⁴, OCOR⁴, OC(O)NHR⁴, NHC(O)OR⁴, OS(O)₂OR⁴, C(O)R⁴, NHC(O)R⁴, detectable labels, and targeting moieties. The symbols R³, R⁴ and R^(4′), R⁵, R⁶ and R⁷ each are members independently selected from H, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, and substituted or unsubstituted heterocycloalkyl.

In Formula II, Z¹ is a member selected from O, S, and NR⁴; R¹ and R² are members independently selected from NHR⁴, SR⁴, OR⁴, OCOR⁴, OC(O)NHR⁴, NHC(O)OR⁴, OS(O)₂OR⁴, C(O)R⁴, NHC(O)R⁴, detectable labels, and targeting moieties. The symbols R³, R⁴, R⁵, R⁶ and R⁷ each are members independently selected from H, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, and substituted or unsubstituted heterocycloalkyl. Formula II is representative of certain embodiments wherein the aglycone portion is conjugated to a further substrate component, for example, a leaving group or a solid support.

In certain embodiments, acceptor substrates such as those depicted in Table 1 below are used in the methods of glycolipid or aglycone synthesis employing the mutant endoglycoceramidases.

TABLE 1 Representative Acceptor Substrates For Glycosynthase Synthesis Reactions

[Structures A through J: chemical structure drawings not reproduced in this text.]

In certain embodiments, the acceptor substrate is a sphingosine, a sphingosine analog or a ceramide. In certain embodiments, the acceptor substrate is one or more sphingosine analogs, including those described in co-pending patent applications PCT/US2004/006904 (which claims priority to U.S. Provisional Patent Application No. 60/452,796); U.S. patent application Ser. No. 10/487,841; U.S. patent application Ser. Nos. 10/485,892; 10/485,195, and 60/626,678.

In general, the sphingosine analogs described in the above-referenced applications are those compounds having the formula:

wherein Z is a member selected from O, S, C(R²)₂ and NR²; X is a member selected from H, —OR³, —NR³R⁴, —SR³, and —CHR³R⁴; R¹, R², R³ and R⁴ are members independently selected from H, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, substituted or unsubstituted heterocycloalkyl, —C(=M)R⁵, —C(=M)-Z¹—R⁵, —SO₂R⁵, and —SO₃; wherein M and Z¹ are members independently selected from O, NR⁶ or S; Y is a member selected from H, —OR⁷, —SR⁷, —NR⁷R⁸, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, and substituted or unsubstituted heterocycloalkyl, wherein R⁵, R⁶, R⁷ and R⁸ are members independently selected from H, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, and substituted or unsubstituted heterocycloalkyl; and R^(a), R^(b), R^(c) and R^(d) are each independently H, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, or substituted or unsubstituted heterocycloalkyl.

In certain embodiments, the acceptor substrate can be one or more sphingosine analogs including D-erythro-sphingosine, D-erythro-sphinganine, L-threo-sphingosine, L-threo-dihydrosphingosine, D-erythro-phytosphingosine, or N-octanoyl-D-erythro-sphingosine.

Production of Glycolipids

Wild-type and mutant endoglycoceramidase polypeptides can be used to make glycolipid products in in vitro reaction mixtures or by in vivo reactions, e.g., by fermentative growth of recombinant microorganisms that comprise nucleotides that encode endoglycoceramidase polypeptides.

A. In Vitro Reactions

The wild-type and mutant endoglycoceramidase polypeptides can be used to make sialylated products in in vitro reaction mixtures. The in vitro reaction mixtures can include permeabilized microorganisms comprising the wild-type or mutant endoglycoceramidase polypeptides, partially purified endoglycoceramidase polypeptides, or purified endoglycoceramidase polypeptides; as well as donor substrates, acceptor substrates, and appropriate reaction buffers. For in vitro reactions, the recombinant wild-type or mutant endoglycoceramidase proteins, acceptor substrates, donor substrates and other reaction mixture ingredients are combined by admixture in an aqueous reaction medium. Additional glycosyltransferases can be used in combination with the endoglycoceramidase polypeptides, depending on the desired glycolipid end product. The medium generally has a pH value of about 5 to about 8.5. The selection of a medium is based on the ability of the medium to maintain pH value at the desired level. Thus, in some embodiments, the medium is buffered to a pH value of about 7.5. If a buffer is not used, the pH of the medium should be maintained at about 5 to 8.5, depending upon the particular endoglycoceramidase and other enzymes used.

Enzyme amounts or concentrations are expressed in activity units, which are a measure of the initial rate of catalysis. One activity unit catalyzes the formation of 1 μmol of product per minute at a given temperature (typically 37° C.) and pH value (typically 7.5). Thus, 10 units of an enzyme is a catalytic amount of that enzyme where 10 μmol of substrate are converted to 10 μmol of product in one minute at a temperature of 37° C. and a pH value of 7.5.
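The unit definition above reduces to a simple rate calculation. The sketch below is illustrative only: the function names are hypothetical, and it assumes the rate is measured in the initial, linear phase of the reaction at the stated temperature and pH.

```python
def activity_units(product_umol, minutes):
    """Activity in units, where 1 unit forms 1 umol of product per minute."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return product_umol / minutes


def minutes_to_convert(substrate_umol, units):
    """Idealized time for a given number of units to convert the substrate.

    Assumes the initial rate is sustained for the whole reaction, which
    real reactions only approximate.
    """
    if units <= 0:
        raise ValueError("units must be positive")
    return substrate_umol / units


if __name__ == "__main__":
    # The example from the text: 10 umol of product formed in one minute.
    print(activity_units(10.0, 1.0))
    print(minutes_to_convert(10.0, 10.0))
```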

The reaction mixture may include divalent metal cations (Mg²⁺, Mn²⁺). The reaction medium may also comprise solubilizing detergents (e.g., Triton or SDS) and organic solvents such as methanol or ethanol, if necessary. The enzymes can be utilized free in solution or can be bound to a support such as a polymer. The reaction mixture is thus substantially homogeneous at the beginning, although some precipitate can form during the reaction.

The temperature at which an above process is carried out can range from just above freezing to the temperature at which the most sensitive enzyme denatures. That temperature range is preferably about 0° C. to about 45° C., and more preferably about 20° C. to about 37° C.

The reaction mixture so formed is maintained for a period of time sufficient to obtain the desired high yield of desired glycolipid determinants. For large-scale preparations, the reaction will often be allowed to proceed for between about 0.5-240 hours, and more typically between about 1-18 hours.

Preferably, the concentrations of activating donor substrates and enzymes are selected such that glycosylation proceeds until the acceptor substrate is consumed.

Each of the enzymes is present in a catalytic amount. The catalytic amount of a particular enzyme varies according to the concentration of that enzyme's substrate as well as to reaction conditions such as temperature, time and pH value. Means for determining the catalytic amount for a given enzyme under preselected substrate concentrations and reaction conditions are well known to those of skill in the art.

B. In Vivo Reactions

The mutant endoglycoceramidase polypeptides can be used to make glycolipid products by in vivo reactions, e.g., fermentative growth of recombinant microorganisms comprising the endoglycoceramidase polypeptides. Fermentative growth of recombinant microorganisms can occur in the presence of medium that includes an acceptor substrate and a donor substrate or a precursor to a donor substrate. See, e.g., Priem et al., Glycobiology 12:235-240 (2002). The microorganism takes up the acceptor substrate and the donor substrate or the precursor to a donor substrate and the addition of the donor substrate to the acceptor substrate takes place in the living cell. The microorganism can be altered to facilitate uptake of the acceptor substrate, e.g., by expressing a sugar transport protein.

For glycosyltransferase cycles carried out in vitro, the concentrations or amounts of the various reactants used in the processes depend upon numerous factors including reaction conditions such as temperature and pH value, and the choice and amount of acceptor saccharides to be glycosylated. Because the glycosylation process permits regeneration of activating nucleotides, activated donor sugars and scavenging of produced PPi in the presence of catalytic amounts of the enzymes, the process is limited by the concentrations or amounts of the stoichiometric substrates discussed before. The upper limit for the concentrations of reactants that can be used in accordance with the method of the present invention is determined by the solubility of such reactants.

Functional Assays for the Endoglycoceramidases

In addition to immunological assays, enzymatic assays can be used for detecting the presence and/or activity of the endoglycoceramidase of the present invention. These enzymatic assays are useful to establish the distinct functional characteristics of the wild-type and mutant endoglycoceramidases of the present invention. The production of glycolipid end products can be monitored by, e.g., determining that production of the desired product has occurred or by determining that a substrate such as the acceptor substrate has been depleted. Those of skill will recognize that glycolipid end products including gangliosides or glycosphingolipid analogs can be identified using techniques such as chromatography, e.g., using paper or TLC plates, or by mass spectrometry, e.g., MALDI-TOF spectrometry, or by NMR spectroscopy.

Assays for Hydrolytic Activity

To test the hydrolytic activity of an endoglycoceramidase, either the wild-type or a modified version of the enzyme, a glycolipid can be used as a substrate. Upon incubation of the substrate (e.g., lyso-GM₂, GM₂, or GM₃) with the endoglycoceramidase under appropriate conditions, assays are performed to detect the presence of hydrolytic products such as an oligosaccharide and an aglycone (e.g., C-18 ceramide), which indicates that the endoglycoceramidase is hydrolytically active. To facilitate the detection of hydrolytic products, the substrate for a hydrolytic assay may be labeled with a detectable moiety, for instance, a fluorescent or radioactive label. Sugars which release a fluorescent or chromophoric group on hydrolysis (e.g., dinitrophenyl, p-nitrophenyl, or methylumbelliferyl glycosides) can also be used to test for hydrolytic activity. A preferred assay format for detecting hydrolytic products includes various chromatographic methods, such as thin-layer chromatography (TLC).

An appropriate control is preferably included in each hydrolytic activity assay such that the activity level of a mutant endoglycoceramidase can be assessed in comparison with that of a wild-type endoglycoceramidase.

Assays for Synthetic Activity

To test the synthetic activity of an endoglycoceramidase, particularly a mutant endoglycoceramidase (or an “endoglycoceramide synthase”), an oligosaccharide and a heteroalkyl substrate, e.g., of Formula I and Formula II, can be used as substrates. Upon incubation of the two substrates with the “endoglycoceramide synthase” under appropriate conditions, assays are performed to detect the presence of glycolipid formed by reaction between the oligosaccharide and the heteroalkyl substrate, e.g., an aglycone including a ceramide or a sphingosine, which indicates that the “endoglycoceramide synthase” is synthetically active. To facilitate the detection of the synthetic process, at least one of the two substrates for the synthetic assay may be labeled with a detectable moiety, for instance, a fluorescent or radioactive label. The same assay format, such as TLC, for detecting hydrolytic products can be used for detecting synthetic products.

An appropriate control is preferably included in each assay such that the activity level of an endoglycoceramide synthase can be assessed in comparison with that of a wild-type endoglycoceramidase.

Synthesis of Glycolipids Using Mutant Endoglycoceramide Synthases

Upon identifying a mutant endoglycoceramidase that is synthetically active, this enzyme can be used for production of a large variety of glycolipids based on different combinations of heteroalkyl substrates. End products of particular interest are glycosylated aglycones, including glycosylated sphingosines, glycosylated sphingosine analogs, and glycosylated ceramides (i.e., cerebrosides and gangliosides). The methods of the invention are useful for producing any of a large number of gangliosides and related structures. Many gangliosides of interest are described in Oettgen, H. F., ed., Gangliosides and Cancer, VCH, Germany, 1989, pp. 10-15, and references cited therein. The end product can be a glycosylsphingosine, a glycosphingolipid, a cerebroside or a ganglioside. Exemplified ganglioside end products include those listed in Table 2, below. Accordingly, in one embodiment, the synthesized glycolipid can be one or more of GD_(1a), GD_(1α), GD_(1b), GD₂, GD₃, Gg3, Gg4, GH₁, GH₂, GH₃, GM₁, GM_(1b), GM₂, GM₃, Fuc-GM₁, GP₁, GP₂, GP₃, GQ_(1b), GQ_(1B), GQ_(1β), GQ_(1c), GQ₂, GQ₃, GT_(1a), GT_(1b), GT_(1c), GT_(1β), GT_(1c), GT₂, GT₃, or polysialylated lactose.

TABLE 2 Exemplified Ganglioside Formulas and Abbreviations

Structure                                            Abbreviation
Neu5Ac3Gal4GlcCer                                    GM3
GalNAc4(Neu5Ac3)Gal4GlcCer                           GM2
Gal3GalNAc4(Neu5Ac3)Gal4GlcCer                       GM1a
Neu5Ac3Gal3GalNAc4Gal4GlcCer                         GM1b
Neu5Ac8Neu5Ac3Gal4GlcCer                             GD3
GalNAc4(Neu5Ac8Neu5Ac3)Gal4GlcCer                    GD2
Neu5Ac3Gal3GalNAc4(Neu5Ac3)Gal4GlcCer                GD1a
Neu5Ac3Gal3(Neu5Ac6)GalNAc4Gal4GlcCer                GD1α
Gal3GalNAc4(Neu5Ac8Neu5Ac3)Gal4GlcCer                GD1b
Neu5Ac8Neu5Ac3Gal3GalNAc4(Neu5Ac3)Gal4GlcCer         GT1a
Neu5Ac3Gal3GalNAc4(Neu5Ac8Neu5Ac3)Gal4GlcCer         GT1b
Gal3GalNAc4(Neu5Ac8Neu5Ac8Neu5Ac3)Gal4GlcCer         GT1c
Neu5Ac8Neu5Ac3Gal3GalNAc4(Neu5Ac8Neu5Ac3)Gal4GlcCer  GQ1b

Nomenclature of Glycolipids, IUPAC-IUB Joint Commission on Biochemical Nomenclature (Recommendations 1997); Pure Appl. Chem. (1997) 69: 2475-2487; Eur. J. Biochem. (1998) 257: 293-298 (see, the worldwide web at chem.qmw.ac.uk/iupac/misc/glylp.html).

Exemplified end products further include those depicted in FIGS. 1-13. Additional end product glycolipids that can be produced using the mutant endoglycoceramidases of the present invention include the glycosphingolipids, glycosylsphingosines and ganglioside derivatives disclosed in co-pending patent applications PCT/US2004/006904 (which claims priority to U.S. Provisional Patent Application No. 60/452,796); U.S. patent application Ser. No. 10/487,841; U.S. patent application Ser. Nos. 10/485,892; 10/485,195, and 60/626,678.

Further modifications can be made to the glycolipids synthesized using the endoglycoceramide synthase of the present invention. Exemplary methods of further elaborating glycolipids produced using the present invention are set forth in WO 03/017949; PCT/US02/24574; US2004063911 (although each is broadly directed to modification of peptides with glycosyl moieties, the methods disclosed therein are equally applicable to the glycolipids and method of producing them set forth herein). Moreover, the glycolipid compositions of the invention can be subjected to glycoconjugation as disclosed in WO 03/031464 and its progeny (although each is broadly directed to modification of peptides with glycosyl moieties, the methods disclosed therein are equally applicable to the glycolipids and method of producing them set forth herein).

EXAMPLES

The following examples are provided by way of illustration only and not by way of limitation. Those of skill in the art will readily recognize a variety of non-critical parameters that could be changed or modified to yield essentially similar results.

Example I: Generating Mutant Endoglycoceramidases

A synthetic endoglycoceramidase gene was produced by Blue Heron Biotechnology (EGCase1395). Subsequently the gene was subcloned into a pT7-7 expression vector (FIG. 14). Mutations at one of the nucleotides encoding Glu233 of endoglycoceramidase derived from Rhodococcus sp. M-777 (GenBank Accession No. AAB67050, SEQ ID NO:3) were introduced into the EGCase gene by a PCR-based method using five primer sets by combining the same 5′ primer with five different 3′ primers:

The 5′ primer:
5′Copt (SEQ ID NO: 34) AATTCGATTGGATCCCATATGAGCGGAAGCG

The 3′ primers:
3′ Asp PstI (SEQ ID NO: 35) TCGATTCTGCAGGGAGCCACCAAACGGGTCATTCATCAG
3′Gln PstI (SEQ ID NO: 36) TCGATTCTGCAGGGAGCCACCAAACGGCTGATTCATCAG
3′Ala PstI-11-1 (SEQ ID NO: 37) CGGTCCCTGCAGGGAGCCACCAAACGGCGCATTCATCAG
3′Gly PstI-11-1 (SEQ ID NO: 38) CGGTCCCTGCAGGGAGCCACCAAACGGCCCATTCATCAG
3′Ser PstI-11-1 (SEQ ID NO: 39) CGGTCCCTGCAGGGAGCCACCAAACGGCGAATTCATCAG
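The five 3′ primers differ from one another, apart from their 5′ flanking bases, only in one three-base codon. A minimal sketch that reverse-complements each primer and translates it with the standard genetic code makes the encoded substitution visible. The reading frame used here (starting at the first base of the reverse complement) is an assumption, chosen because it yields an Asn-X-Pro stretch consistent with the Asn-Glu-Pro subsequence described above; the dictionary and function names are hypothetical.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# 3' primers as listed in Example I (SEQ ID NOs: 35-39).
PRIMERS_3 = {
    "Asp": "TCGATTCTGCAGGGAGCCACCAAACGGGTCATTCATCAG",
    "Gln": "TCGATTCTGCAGGGAGCCACCAAACGGCTGATTCATCAG",
    "Ala": "CGGTCCCTGCAGGGAGCCACCAAACGGCGCATTCATCAG",
    "Gly": "CGGTCCCTGCAGGGAGCCACCAAACGGCCCATTCATCAG",
    "Ser": "CGGTCCCTGCAGGGAGCCACCAAACGGCGAATTCATCAG",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(_COMPLEMENT)[::-1]


def translate(dna):
    """Translate in frame 0, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def encoded_peptides():
    """Map each primer name to the peptide read from its sense strand.

    Assumption: the coding frame begins at the first base of the
    reverse complement of the primer.
    """
    return {name: translate(reverse_complement(seq))
            for name, seq in PRIMERS_3.items()}


if __name__ == "__main__":
    for name, peptide in encoded_peptides().items():
        print(name, peptide)
```

Under the stated frame assumption, the fourth residue of each translated peptide is the substituted amino acid named in the primer label.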

The PCR program used for generating mutations was essentially as follows: the template and primers were first incubated at 95° C. for 5 minutes, Vent DNA polymerase (New England Biolabs) was then added, which was followed by 30 cycles of amplification: 94° C. for 1 minute, 55° C. for 1 minute, and 72° C. for 2 minutes.

PCR products were digested with NdeI and PstI, and pT7-7 vector was digested with NdeI, EcoRI, and PstI. Following purification of the digestion products from a 0.8% TAE agarose gel, the PCR products were subcloned into pT7-7 vector via a ligation reaction. Upon completion of the ligation reaction, the ligation product was electroporated into BL21DE3 LacZ- cells, which were prepared from BL21DE3 cells (William Studier, Brookhaven National Laboratories, Upton, N.Y.) by disrupting the LacZ gene with a tetracycline or kanamycin resistance gene (generated at Neose Technologies, Inc.). Colonies were screened for PCR product insert. All EGCase mutants were confirmed by sequencing.

Example II: Hydrolytic Assays

An exemplary hydrolytic reaction had a volume of 50 μL, containing 20 μg of substrate (pre-dried lyso-GM2, GM2, or GM3, generated at Neose Technologies, Inc.), 25 μg of taurodeoxycholic acid (Sigma, Cat #T-0875), 50 mM sodium acetate (pH 5.2), and 5-10 μL of crude cell lysate containing a wild-type or mutant EGCase. The hydrolytic mixture was incubated at 37° C. for 10 to 120 minutes.

Example III: Synthetic Assays

An exemplary synthetic reaction had a volume of 50 μL, containing 5 mM MgCl₂, 0.5% detergent, 0.3 mM ceramide-C-18 (pre-dried), 20 mM Tris-HCl (pH 7.5), and 0.36 mM 3′ sialyl lactose fluoride (3′ SLF). The detergents used in the reaction were Triton-X100 (0.5%), taurodeoxycholic acid (25 μg), NP-40 (0.5%), Tween-80 (0.5%), 3-14 Zwittergent (0.5%), and Triton-CF54 (0.5%). The reaction times ranged from 2 to 16 h in various buffers ranging in pH from 5.2 to 8.0.
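The molar amounts implied by the exemplary synthetic reaction can be worked out directly from the stated concentrations and the 50 μL volume, since 1 mM equals 1 nmol per μL. The sketch below is illustrative; the function names are hypothetical, and the identification of the acceptor as limiting assumes a 1:1 donor-to-acceptor stoichiometry.

```python
def nmol_in_reaction(conc_mM, volume_uL):
    """Amount in nmol; 1 mM is 1 nmol per microliter."""
    return conc_mM * volume_uL


def synthetic_reaction_summary(volume_uL=50.0,
                               ceramide_mM=0.3,
                               donor_mM=0.36):
    """Summarize substrate amounts for the exemplary 50 uL reaction."""
    acceptor = nmol_in_reaction(ceramide_mM, volume_uL)
    donor = nmol_in_reaction(donor_mM, volume_uL)
    return {
        "acceptor_nmol": acceptor,
        "donor_nmol": donor,
        "donor_to_acceptor_ratio": donor / acceptor,
        # Assumes one donor molecule is consumed per acceptor molecule.
        "limiting": "acceptor" if acceptor <= donor else "donor",
    }


if __name__ == "__main__":
    print(synthetic_reaction_summary())
```

On these figures the reaction contains about 15 nmol of ceramide and about 18 nmol of 3′ SLF, a roughly 1.2-fold molar excess of donor, so conversion would be reported against the ceramide acceptor.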

Example IV: TLC Analysis

5 μL of a hydrolytic or synthetic reaction was spotted on a TLC plate. The plate was then dried with a hair dryer set on low. The plate was run in an appropriate solvent system (solvent A: chloroform/methanol at 95:5 v/v, solvent B: 1-butyl alcohol/acetic acid/H₂O at 2:1:1 v/v, solvent C: chloroform/methanol/H₂O/ammonium hydroxide at 60:40:5:3). The plate was then dried and stained with anisaldehyde. The TLC plate was subsequently developed by heating on a hot plate set at three.

Example V

The following example illustrates the successful generation of a glycosynthase enzyme capable of performing the efficient glycosidic coupling between 3′-sialyllactosyl fluoride and a variety of lipid acceptors by performing selected modifications on the endoglycoceramidase II enzyme from Rhodococcus M-777 (SEQ ID NO:3).

Cloning of Exemplified Mutant Endoglycoceramidase E351S

The DNA sequence of the wild-type EGCase gene from Rhodococcus was used as a template for the design of the construct. Using an overlapping PCR strategy, an amino acid substitution of serine for glutamic acid at amino acid position 351 relative to the wild-type enzyme was engineered into the coding sequence (see primer sequences, SEQ ID NOs:40-47). The final coding sequence was also truncated at amino acid 29 relative to the wild-type enzyme in order to mimic the mature version of the enzyme that is normally generated during secretion (SEQ ID NOs:48 and 49). Restriction sites were engineered onto the ends of the coding sequence (NdeI and XhoI, respectively) in order to ligate to the corresponding sites in frame with the six-His tag from the pET28A vector (Novagen/EMD Biosciences, San Diego, Calif.). This construct was confirmed to be correct by restriction and sequence analysis and was then used to transform the E. coli strain BL21(DE3) (Novagen) using 50 mcg/ml kanamycin selection. An individual colony was used to inoculate a culture of Maritone-50 mcg/ml kanamycin that was incubated for 16 hrs at 37° C. A sample of the culture was mixed with glycerol to a final concentration of 20%, and aliquots were frozen at −80° C. and referred to as stock vials.
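The substitution notation used above (for example, E351S: glutamic acid at position 351 replaced by serine) can be illustrated with a minimal sketch. The function name and the placeholder sequence below are hypothetical and serve only to show the 1-based numbering convention; the placeholder is not SEQ ID NO:3, and the sketch is not the PCR-based method actually used to construct the mutant.

```python
import re


def apply_substitution(sequence: str, mutation: str) -> str:
    """Apply a point substitution such as 'E351S' to a protein sequence.

    Positions are 1-based and refer to the sequence passed in. A ValueError
    is raised if the residue found at that position is not the wild-type
    residue named in the mutation string.
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation string: {mutation!r}")
    wild_type = match.group(1)
    position = int(match.group(2))
    replacement = match.group(3)
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[position - 1] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, "
            f"found {sequence[position - 1]}"
        )
    return sequence[: position - 1] + replacement + sequence[position:]


# Hypothetical placeholder (NOT SEQ ID NO:3): 350 alanines, Glu at 351, 20 alanines.
toy_wild_type = "A" * 350 + "E" + "A" * 20
toy_mutant = apply_substitution(toy_wild_type, "E351S")
print(toy_mutant[350])  # residue now at position 351
```

The wild-type check guards against numbering mismatches, which is relevant here because position 351 is counted relative to the full-length wild-type enzyme rather than the truncated coding sequence.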

Mutant Endoglycoceramidase (EGC) Expression and Purification

Wild-type EGC and the following EGC mutants: E351A, E351D, E351D, E351G, and E351S have been successfully expressed and purified. The expression levels for the EGC variants are quite high; therefore, cell cultures of 50 ml were used to produce the enzymes.

Cells from a −80° C. freezer stock were directly inoculated into 50 ml Typ broth and were grown at 37° C. to saturation. The temperature was then lowered to 20° C. and protein production was induced by addition of IPTG to 0.1 mM (due to solubility issues, the E351G mutant was expressed at an IPTG concentration of 0.05 mM to prevent aggregation). After 8-12 hours, the cells were harvested by centrifugation and the pellet was resuspended in 2.5 ml BugBuster protein extraction reagent (Novagen). Cell lysis was allowed to proceed for 20 min, and the cell debris was then removed by centrifugation.

The cell lysate was then applied to a 1 ml Ni-NTA column (Amersham), which was then washed with two column volumes of binding buffer (20 mM sodium phosphate, pH 7.0, containing 0.5 M NaCl). EGC was eluted by the stepwise addition of imidazole to a final concentration of 0.5 M (EGC elutes between 0.2 and 0.3 M imidazole). Fractions containing EGC were identified by SDS-PAGE. The purification gave a protein of >95% purity after a single step. The expression and purification of exemplified Rhodococcus EGC mutant E351S is depicted in FIG. 16.

Fractions containing EGC were pooled and the buffer was changed to 25 mM NaOAc, pH 5.0, containing 0.2% Triton X-100 using an Amicon centrifugal ultrafiltration device (MWCO=10,000 Da). At this time, the protein was concentrated to a final volume of approximately 2 ml.

Protein concentration was then assessed using the Bradford method. The purification generally yielded about 10 mg EGC (180-200 mg per liter of expression culture). The enzyme was stable in this form for at least 3 months.
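The relationship between the yield per 50 ml culture and the per-liter figure quoted above can be checked with a short calculation. The sketch below is illustrative only; the function name is hypothetical, and the 9 mg lower bound is simply the value implied by the stated 180 mg per liter.

```python
def yield_mg_per_liter(protein_mg: float, culture_ml: float) -> float:
    """Scale a purified protein mass to a yield per liter of culture."""
    return protein_mg * 1000.0 / culture_ml


# About 10 mg recovered from a 50 ml culture (values from the text above).
print(yield_mg_per_liter(10.0, 50.0))  # upper end of the stated range
# 9 mg is an assumed value corresponding to the stated lower bound.
print(yield_mg_per_liter(9.0, 50.0))
```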

Enzymatic Synthesis of Lyso-GM₁ by Mutant EGC Enzymes

Reactions were performed in 25 mM NaOAc (pH 5.0) containing 0.1-0.2% Triton X-100. A typical reaction mixture contained approximately 50 mg/ml of a fluorinated GM1 sugar donor (GM1-F), 15 mg/ml of an acceptor sphingosine, and 2.0 mg/ml of the appropriate EGC mutant in a total reaction volume of 50 μl. Under these conditions, the reaction proceeds to >90% completion within 12 hours at 37° C. based on TLC analysis. Transfer of the fluorinated GM1 sugar donor was monitored using an HPLC reverse phase method on a Chromolith RP-8e column with eluants of 0.1% trifluoroacetic acid (TFA) in acetonitrile (ACN) to 0.1% TFA in H₂O. Exemplified results of HPLC monitoring of a glycosynthase reaction for a Rhodococcus E351S mutant are depicted in FIG. 17.

Enzymatic Synthesis of Lyso-GM₃ by Mutant EGC Enzymes

Reactions were performed in 25 mM NaOAc (pH 5.0) containing 0.2% Triton X-100. A typical reaction mixture contained approximately 10 mM 3′-sialyllactosyl fluoride (3′-SLF), 20 mM of the acceptor D-erythro-sphingosine, and 0.5 mg/ml of the appropriate EGC mutant in a total reaction volume of 100 μl. Under these conditions, the reaction proceeds to >90% completion within 12 hours at 37° C. based on TLC analysis. In addition to D-erythro-sphingosine, Table 1, above, shows the structures of other acceptor species that have been used in glycosynthase reactions with 3′-SLF.

Essentially all of the 3′-SLF was consumed in the enzymatic reaction with D-erythro-sphingosine. Thus, a conservative estimate for this reaction is a minimum of 90% turnover with respect to 3′-SLF. The TLC running solvent was CHCl₃/MeOH/0.2% CaCl₂ (5:4:1), with detection by orcinol-H₂SO₄ stain. Purification of the lyso-GM₃ product was achieved using a combination of normal phase and reversed phase SepPak cartridges (Waters). The identity of the product as lyso-GM₃ was supported by mass spectrometry and NMR.
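For orientation, the absolute amounts of each component in the typical lyso-GM₃ reaction mixture described above can be worked out from the stated concentrations and the 100 μl volume. The following sketch uses hypothetical helper names and relies only on unit identities (mM × μl = nmol; mg/ml × μl = μg).

```python
def nanomoles(concentration_mM: float, volume_ul: float) -> float:
    """Amount of solute in nmol: mM multiplied by microliters gives nmol."""
    return concentration_mM * volume_ul


def micrograms(concentration_mg_per_ml: float, volume_ul: float) -> float:
    """Mass of solute in micrograms: mg/ml multiplied by microliters gives ug."""
    return concentration_mg_per_ml * volume_ul


volume_ul = 100.0                           # total reaction volume
donor_nmol = nanomoles(10.0, volume_ul)     # 3'-SLF at approximately 10 mM
acceptor_nmol = nanomoles(20.0, volume_ul)  # D-erythro-sphingosine at 20 mM
enzyme_ug = micrograms(0.5, volume_ul)      # EGC mutant at 0.5 mg/ml

print(donor_nmol, acceptor_nmol, enzyme_ug)
print(acceptor_nmol / donor_nmol)           # acceptor-to-donor molar ratio
```

The acceptor is thus present in twofold molar excess over the donor, which is consistent with turnover being reported with respect to 3′-SLF.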

Example VI: Kinetic Parameters of Wild-Type Rhodococcus M-777 Endoglycoceramidase

Using 2,4-dinitrophenyl lactoside as a substrate, the Rhodococcus M-777 EGC enzyme has a Km of approximately 2 mM, and a kcat of 90 min⁻¹ (FIG. 18). The dependence of the activity on detergent concentration was also investigated. It was found that in the absence of detergent, the rate of hydrolysis was very low. With the addition of Triton X-100 to 0.1%, the kcat/Km increased dramatically, and gradually decreased with further additions of detergent. The dependence of kcat/Km on detergent concentration leveled off at concentrations greater than 0.5%; increasing the detergent concentration caused a steady increase in both kcat and Km up to a concentration of 1% (FIGS. 19A-C). The pH dependence of the hydrolysis activity was also investigated. As expected, the maximal kcat/Km is observed around pH 5 (FIG. 20).
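To put the reported parameters in context, the following sketch computes the specificity constant kcat/Km and the turnover rate at a given substrate concentration. It assumes simple Michaelis-Menten kinetics, which is implied by the reporting of Km and kcat but is not stated explicitly above; the function name is hypothetical, and the parameter values are the approximate figures given in the text.

```python
def turnover_rate_per_min(substrate_mM: float,
                          kcat_per_min: float = 90.0,
                          km_mM: float = 2.0) -> float:
    """Rate per enzyme molecule, v/[E] = kcat * [S] / (Km + [S]).

    Assumes Michaelis-Menten kinetics; defaults are the approximate
    values reported for 2,4-dinitrophenyl lactoside.
    """
    return kcat_per_min * substrate_mM / (km_mM + substrate_mM)


kcat_per_min = 90.0
km_mM = 2.0
specificity_constant = kcat_per_min / km_mM  # units: mM^-1 min^-1
print(specificity_constant)

# At [S] = Km the rate is half of kcat by definition.
print(turnover_rate_per_min(2.0))
```

Because the detergent changes both kcat and Km, the ratio kcat/Km rather than either parameter alone is the quantity tracked against detergent concentration and pH.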

Example VII: Expression of Wild-Type Propionibacterium acnes Endoglycoceramidase in E. coli

The expression level of P. acnes EGC enzyme was extremely high, likely exceeding 200 mg/l. However, the expressed protein exclusively formed inclusion bodies under a variety of conditions. This propensity to form inclusion bodies is also observed for the Rhodococcus enzyme, but it is possible to minimize this tendency using Tuner cells in conjunction with a low induction temperature (<20° C.) and low concentration of IPTG (0.1 mM). These tactics proved unsuccessful with the P. acnes enzyme. Furthermore, the P. acnes enzyme was found to express at a very high level even in the absence of IPTG, with inclusion bodies forming during the pre-induction growth phase.

A series of experiments was performed to try to bring at least some protein into the soluble fraction, including:

-   variation of induction temperature (16-37° C.) in conjunction with variation of [IPTG] (0-0.1 mM);
-   pre-induction growth at room temperature to lower the levels of background expression;
-   transformation into BL21 pLysS (to suppress background expression) with variation of conditions as described above;
-   expression from a lac promoter rather than the T7 system with the above variations;
-   heat shock of the cells prior to induction (42° C. and 60° C. for 2 min in separate experiments) to induce chaperone expression;
-   adding a pelB signal sequence to direct secretion into the periplasm; and
-   resolubilizing the inclusion bodies by denaturation with either urea (8 M) or guanidinium HCl (2 M) as the chaotropic agent, followed by either iterative lowering of the denaturant concentration by dialysis or removal of the denaturant by first adsorbing the protein onto a Ni-NTA column and then decreasing the denaturant concentration using a linear gradient.

Soluble P. acnes EGC was obtained by performing the growth and induction steps in M9 minimal medium using Tuner cells with induction overnight at 18° C. in the presence of 0.1 mM IPTG (essentially the same conditions used for the Rhodococcus enzyme, except with minimal media rather than rich) (FIG. 21, lane 6). In a simultaneous experiment using BL21 pLysS as the expression strain, inclusion bodies were formed, presumably due to the action of the lactose permease in increasing the internal IPTG concentration to a level where expression still proceeds at a very high rate even in minimal media. Simultaneously employing the following three tactics lowered the rate of protein production sufficiently to obtain soluble P. acnes EGC enzyme while retaining the His tag: (i) minimal media for growth and expression, (ii) a very low IPTG concentration, and (iii) expression in the lactose permease deficient Tuner cells. Under these conditions, hydrolysis activity on both 2,4-dinitrophenyl lactoside and GM3 ganglioside in the cell extract was detected.

A gene construct for an E319S mutant EGC was prepared in parallel with the wild-type sequence. This mutant enzyme catalyzed the glycosynthase reaction as well.

It is understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and scope of the appended claims. All publications, patents, and patent applications cited herein are hereby incorporated by reference in their entirety for all purposes.

1. (canceled)
2. A mutant endoglycoceramidase, wherein the mutant endoglycoceramidase comprises a wild-type endoglycoceramidase peptide sequence including a nucleophilic region as set forth in SEQ ID NO: 54 modified by replacing a nucleophilic carboxylate amino acid residue of the nucleophilic region with a serine (Ser), glycine (Gly), or alanine (Ala) amino acid residue.
3. The mutant endoglycoceramidase of claim 2, wherein the mutant is capable of catalyzing the transfer of a saccharide moiety from a donor substrate to an acceptor substrate selected from a sphingosine, a ceramide, or an analog thereof, thereby producing a glycolipid.
4. The mutant endoglycoceramidase of claim 2, wherein the corresponding wild-type endoglycoceramidase comprises an amino acid sequence set forth in SEQ ID NOs: 3, 6, 8, 11, 14, 17, 20, 23, 25, 26, 27, or 28.
5. The mutant endoglycoceramidase of claim 2, comprising any one of the amino acid sequences set forth in SEQ ID NOs: 55-66.
6. A nucleic acid encoding the mutant endoglycoceramidase of claim 2.
7. A vector comprising the nucleic acid of claim 6.
8. A host cell comprising the nucleic acid of claim 6.
9. A host cell comprising the vector of claim 7.
10. A method of producing a mutant endoglycoceramidase, comprising growing the host cell of claim 8 under conditions suitable for expression of the mutant endoglycoceramidase.
11. A host cell expressing the mutant endoglycoceramidase of claim 2.
12. The mutant endoglycoceramidase of claim 3, wherein the mutant catalyzes the transfer of the saccharide moiety from the donor substrate to the acceptor substrate at a rate that exceeds hydrolysis of the glycolipid.
13. The mutant endoglycoceramidase of claim 3, wherein the mutant exhibits increased catalytic activity in the transfer of the saccharide moiety from the donor substrate to the acceptor substrate as compared to the wild-type endoglycoceramidase.
14. The mutant endoglycoceramidase of claim 3, wherein the mutant exhibits decreased catalytic activity in hydrolyzing the glycolipid as compared to the wild-type endoglycoceramidase.
15. A method of producing a glycolipid, the method comprising: contacting a donor substrate comprising an activated saccharide moiety and an aglycone acceptor substrate with the mutant endoglycoceramidase of claim 2 in a reaction mixture.
16. The method of claim 15, wherein the mutant catalyzes the transfer of the saccharide moiety from the donor substrate to the acceptor substrate at a rate that exceeds hydrolysis of the glycolipid.
17. The method of claim 15, wherein the mutant endoglycoceramidase exhibits increased catalytic activity in the transfer of the saccharide moiety from the donor substrate to the acceptor substrate as compared to the wild-type endoglycoceramidase.
18. The method of claim 15, wherein said mutant exhibits decreased catalytic activity in hydrolyzing the glycolipid as compared to the wild-type endoglycoceramidase.
19. The method of claim 15, wherein the acceptor substrate is sphingosine.
20. The method of claim 15, wherein the acceptor substrate is a sphingosine analog having a structure defined by the formula:

wherein: Z is O, S, C(R²)₂ or NR²; X is H, —OR³, —NR³R⁴, CR³, or —CHR³R⁴; R¹, R², R³ and R⁴ are independently selected from H, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, substituted or unsubstituted heterocycloalkyl, —C(=M)R⁵, —C(=M)-Z¹—R⁵, —SO₂R⁵, or —SO₃; wherein M and Z¹ are independently selected from O, NR⁶ or S; and R⁵, R⁶, R⁷ and R⁸ are independently selected from H, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, substituted or unsubstituted heterocycloalkyl; Y is H, —OR⁷, —SR⁷, —NR⁷R⁸, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, or substituted or unsubstituted heterocycloalkyl; and R^(a), R^(b), R^(c) and R^(d) are independently selected from H, substituted or unsubstituted alkyl, substituted or unsubstituted heteroalkyl, substituted or unsubstituted aryl, substituted or unsubstituted heteroaryl, or substituted or unsubstituted heterocycloalkyl.
21. The method of claim 15, wherein the acceptor substrate is a sphingosine analog selected from D-erythro-sphingosine, D-erythro-sphinganine, L-threo-sphingosine, L-threo-dihydrosphingosine, D-erythro-phytosphingosine, and N-octanoyl-D-erythro-sphingosine.
22. The method of claim 15, wherein the acceptor substrate is a molecule having a structure selected from:


23. The method of claim 15, wherein the donor substrate is a glycosyl fluoride.
24. The method of claim 15, wherein the glycolipid is a ganglioside selected from disialoganglioside (GD_(1a), GD_(1α), GD_(1b), GD₂, GD₃), galactosylganglioside (Gg₃, Gg₄), monosialyltetrahexosylceramide (GH₁, GH₂, GH₃), monosialoganglioside (GM₁, GM_(1b), GM₂, GM₃, Fuc-GM₁), pentasialoganglioside (GP₁, GP₂, GP₃), tetrasialoganglioside (GQ_(1b), GQ_(1B), GQ_(1β), GQ_(1c), GQ₂, GQ₃), trisialoganglioside (GT_(1a), GT_(1b), GT_(1c), GT_(1β), GT_(1c), GT₂, and GT₃).
25. The method of claim 15, wherein the glycolipid is monosialoganglioside 1 (GM₁).
26. The method of claim 25, wherein the glycolipid is lysomonosialoganglioside 1 (lyso-GM₁).